AI Agent Engineer
SoBanHang
Software Engineering, Data Science
Ho Chi Minh City, Vietnam
Working hours: Monday - Friday, 8:30 - 18:00 (lunch break 12:00 - 13:30); Saturday, 8:30 - 12:00
Work location: 173 Tran Nao, An Khanh Ward, Thu Duc City, Ho Chi Minh City
About Us
Finan Pte Ltd is a fintech company on a mission to make it easier for businesses to run and grow. We build AI-powered solutions that connect growth, operations, finance, suppliers, and partners, helping businesses work smarter and grow faster, and enabling partners to serve them more efficiently.
Our two flagship products reflect this vision:
- SoBanHang is the #1 business app in Vietnam that helps shop owners manage sales, inventory, receivables, and digital financial tools — all from a mobile phone.
- FinanBook is the modern finance and banking platform that helps SMEs control spending, manage cash flow, and grow smarter.
To date, Finan has supported over 650,000 businesses and processed more than $3 billion in annual transactions, with an average app rating of 4.8★ from over 60,000 users.
About the Product
We're building an AI-native operating platform — multi-tenant, multi-region, multi-vertical. This isn't "bolting AI onto an existing product." We're rebuilding from the ground up with AI as the core operating layer, serving multiple industries across multiple countries, with high transaction volume and strict accuracy/audit requirements.
The AI layer is the core value driver, not a side feature:
- Outcome agents — measured by real business results, billed pay-per-result
- Copilot streaming chat with rich content blocks (text / table / chart / action proposals / citations) plus tool-use orchestration
- Voice-to-form with per-form vocabulary tuning for field workers
- Workflow Engine running AI activities as service tasks (decision, dispatch, webhook)
- Rule-first → Agent fallback for high-volume matching and classification pipelines
What You'll Do
- Own a product capability end-to-end: business objective → spec → architecture → build → deploy → measure
- Design and ship AI agents with clear success signals tied to measurable business outcomes
- Orchestrate tool use on leading LLM APIs: define tool schemas, manage conversation state, and handle streaming, prompt caching, and per-tenant token budgets
- Build Copilot streaming chat that returns rich content blocks — integrated with BFF, rendered as action buttons / tables / charts on the frontend
- Integrate workflow orchestration for long-running agents (multi-step, retry/compensation, human approval gates)
- Measure and optimize: cost per outcome, p95 latency, success rate, hallucination rate, prompt cache hit rate — every agent ships with a dashboard
- Build RAG pipelines for tenant knowledge bases
- Implement guardrails: tenant isolation, PII redaction, capability gating, full audit trails
- Write technical specs before writing code: BRD, data model, API contract, ADR
How We Work
A few things that are genuinely different here:
Spec before code. Every feature starts with a written plan: brainstorm, write the plan, then execute. We don't figure it out as we go.
Eval-driven AI. Every prompt change ships with before/after evals. We don't deploy on vibes.
Verify before done. "Seems to work" doesn't count. Tests run, outputs checked, then it's done.
Multi-vertical by default. Every design decision gets stress-tested against one question: "What if a tenant from a different industry or country onboards tomorrow?" No vertical-specific assumptions baked in.
Modules first, services later. New domains default to a module inside an existing service. We only split when the boundary is clear enough to justify it.
Code review that actually means something. Automated toolchain plus peer review. Nothing merges unreviewed.
Your Skills and Experience
Must-have
- 5+ years as a senior software engineer — building and maintaining real production systems, not just prototypes
- 2+ years working with LLM APIs in production — live systems, real users, real incidents you've debugged and fixed
- Fluent in at least one of Python or Go, and comfortable writing code in both
- TypeScript + React — enough to build Copilot UI, streaming renders, and action block components
- Deep understanding of tool use / function calling: schema design, conversation loops, error handling, streaming
- Disciplined prompt engineering: versioning, evals, A/B testing — not just "write a prompt until it works"
- Prompt caching: know when a request hits or misses the cache, and how to structure prompts to maximize cache hit rate
- Strong SQL: transactions, indexes, row-level security, query optimization; understand OLTP vs OLAP
- Ability to design service-level architecture: data flow, API contracts, storage models, integration points — and articulate trade-offs clearly
- Experience integrating external APIs with quirks: schema drift, idempotency, retry logic, reconciliation between sources of truth
- Understand multi-tenancy: tenant isolation, per-tenant quotas, capability gating at the DB and application layer
- Can write technical specs clear enough for others to implement without hand-holding
- Outcome-driven mindset: success means the business outcome is achieved, not that the feature shipped
Not a fit if
- You've only built side projects or hackathon demos — no production systems with real users
- You only own the AI layer and can't take responsibility for backend or frontend
- You think "AI agent" means longer prompts or chaining more LLM calls
- You don't have an eval framework — you test by chatting and seeing if it "looks right"
- You're not comfortable reading long specs (typically 500–2,000 lines) before writing code
- You can't commit to full-time onsite in Ho Chi Minh City
Why You'll Love Working Here
- Genuinely hard technical problems — you're not wrapping ChatGPT. You're building AI agents measured by real business outcomes on a complex multi-tenant platform
- Real ownership — from spec to ship to maintain, not broken into mechanical tickets
- Spec-first, no pointless meetings — flexible working hours, focused on output
- Personal LLM API budget for learning and experimentation
- Access to our full internal AI tooling stack
- Hardware support — machine and monitor provided at the office
- Competitive compensation by level (Senior / Staff) — discussed after Round 1
- A team that's serious about engineering — real code reviews, eval-driven development, no deploying on gut feeling