Available for new data expeditions

Architect of the Lakehouse.

I turn chaotic, web-scraped data into strategic power — forging raw JSON through Bronze, Silver, and Gold tiers, and summoning RAG-augmented AI agents that automate the mundane and divine actionable truth.

Databricks, Delta Lake, PySpark, Polars, DuckDB, LangGraph, Airflow, Docker
MEDALLION PIPELINE ● live

  • Bronze: raw scraped JSON
  • Silver: cleaned & conformed
  • Gold: analytics-ready

  • +34.2% data match-rate lift
  • −32% non-performing loans
  • 42M rows on one node
  • $5.9M cost saved

My Journey

As a dual-classed Mechatronics Engineer and Master of Finance, I operate as an Architect of the Lakehouse. I transform chaotic data streams—like untamed web-scraped JSON—into strategic power through the Bronze, Silver, and Gold tiers of the Medallion architecture.

Armed with Python, Polars, and Object-Oriented methodologies, I summon autonomous, RAG-augmented AI systems that navigate enterprise labyrinths. Operating from my remote stronghold in Guatemala—and fueled by meticulously dialed-in espresso—I bridge complex business strategy and resilient infrastructure.

Technical Grimoire

  • Databricks & Delta Lake
  • Python (Pandas/Polars)
  • RAG & Agentic AI (LangGraph)
  • ETL/ELT & Medallion Architecture
  • DuckDB & Cloudflare R2
  • Docker & CI/CD Pipelines

Forged Projects

Market Intelligence & AI Co-host

Lead Data & AI Engineer

Built an end-to-end data platform for optimizing short-term rental performance. It integrates Dockerized scrapers, a Medallion data lakehouse built on DuckDB, and an autonomous AI Co-host agent that uses LangGraph and Gemini 1.5 Pro with a RAG pipeline.

Delta Lake, DuckDB, LangGraph, Gemini Pro, Cloudflare
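For readers new to the Medallion pattern, the Bronze, Silver, and Gold flow can be sketched in a few lines. This is a minimal illustration in plain Python, standing in for the DuckDB-based lakehouse rather than reproducing it; the sample records, field names (`listing_id`, `city`, `nightly_price`), and function names are all hypothetical.

```python
import json
from collections import defaultdict

# Hypothetical scraped listings; field names are illustrative assumptions.
RAW_LINES = [
    '{"listing_id": "a1", "city": " Antigua ", "nightly_price": "120.50"}',
    '{"listing_id": "a2", "city": "antigua", "nightly_price": "80"}',
    '{"listing_id": "a2", "city": "antigua", "nightly_price": "80"}',
    '{"listing_id": "b1", "city": "Panajachel", "nightly_price": null}',
    'not valid json',
]

def to_bronze(lines):
    """Bronze: keep every raw line as-is, recording whether it parsed."""
    bronze = []
    for line in lines:
        try:
            bronze.append({"raw": line, "parsed": json.loads(line)})
        except json.JSONDecodeError:
            bronze.append({"raw": line, "parsed": None})
    return bronze

def to_silver(bronze):
    """Silver: drop unparseable or incomplete rows, normalize types, deduplicate."""
    seen, silver = set(), []
    for row in bronze:
        rec = row["parsed"]
        if not isinstance(rec, dict) or rec.get("nightly_price") is None:
            continue
        key = rec["listing_id"]
        if key in seen:
            continue
        seen.add(key)
        silver.append({
            "listing_id": key,
            "city": rec["city"].strip().title(),
            "nightly_price": float(rec["nightly_price"]),
        })
    return silver

def to_gold(silver):
    """Gold: aggregate to an analytics-ready view (average price per city)."""
    prices = defaultdict(list)
    for rec in silver:
        prices[rec["city"]].append(rec["nightly_price"])
    return {city: round(sum(p) / len(p), 2) for city, p in prices.items()}

bronze = to_bronze(RAW_LINES)
silver = to_silver(bronze)
gold = to_gold(silver)
```

Bronze keeps everything, including the malformed line, so nothing scraped is lost; cleaning and deduplication happen only in Silver, and Gold holds the aggregate a dashboard or agent would query.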

Market Insight Data Platform

Data Engineer / Business Analyst @ Halo/Mercer

Architected a Medallion Lakehouse on AWS Databricks. Engineered a single-node, out-of-core pipeline in Python and Polars that bypasses memory bottlenecks and processes 65 GB datasets locally.

AWS Databricks, Python/Polars, Apache Airflow, Unity Catalog
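The idea behind out-of-core processing is to stream a file in bounded chunks and fold each chunk into a running result, so memory use depends on the chunk size rather than the file size. The sketch below shows that idea with the Python standard library only; it is not the Polars pipeline described above, and the file layout, column names (`region`, `amount`), and function names are assumptions made for illustration.

```python
import csv
import os
import tempfile
from collections import Counter
from itertools import islice

def iter_chunks(path, chunk_size):
    """Yield lists of at most chunk_size rows so only one chunk is in memory."""
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        while True:
            chunk = list(islice(reader, chunk_size))
            if not chunk:
                return
            yield chunk

def total_by_key(path, key, value, chunk_size=1000):
    """Fold each chunk's values into one running total per key."""
    totals = Counter()
    for chunk in iter_chunks(path, chunk_size):
        for row in chunk:
            totals[row[key]] += float(row[value])
    return dict(totals)

# Tiny hypothetical file standing in for a dataset too large for RAM.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    writer = csv.writer(tmp)
    writer.writerow(["region", "amount"])
    for i in range(10):
        writer.writerow(["north" if i % 2 == 0 else "south", i])
    sample_path = tmp.name

totals = total_by_key(sample_path, "region", "amount", chunk_size=3)
os.remove(sample_path)
```

A columnar engine applies the same principle with far better throughput; the point here is only that the aggregate never requires the whole dataset to be resident at once.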

Credit Risk ML Engine

Data Scientist @ Bantrab

Engineered an enterprise Lakehouse platform and deployed a machine learning behavioral scoring model using Scikit-learn to forecast credit default risk, establishing full MLOps CI/CD pipelines.

PySpark, Scikit-learn, Azure Pipelines, Docker

Impact & Metrics

  • 34.2%: Increase in Data Match Rates (via OOP algorithm optimization)
  • 32%: Reduction in NPLs (within 6 months via ML behavioral scoring)
  • 28%: Avg. Monthly Sales Boost (via decoupled CMS architectures)
  • Q. 46M: Capital Saved (via data mining & predictive logistics)