Allium makes blockchain data accurate, simple and fast
Blockchain data is hard, messy, and chaotic
When we started out in late 2021, our thesis was simple: blockchain data, despite being public and free, was difficult to understand, clunky to access, and troublesome to maintain. Answering a simple question like "Who are the biggest Ethereum token holders over time?" requires an engineering team to run its own RPC nodes, ingest the full history of the blockchain, clean the data, transform the data, and finally summon a wizard to cast a complex SQL query.
Accessing data is hard because blockchains are optimized for Writes and not Reads
Why is it so hard? Blockchains have historically been optimized for Writes (getting data onto the blockchain) and less for Reads (getting data OUT of the blockchain). Optimization efforts focused on increasing transaction throughput and building fault-tolerant, scalable consensus algorithms. With the read path neglected, it is hard to get data out efficiently and reliably at scale.
Parsing and interpreting blockchain data requires both deep domain expertise and data manipulation
To quote Tim Roughgarden, Columbia Professor, "Blockchains are (virtual) computers, not databases." They are Turing machines that support general computations, and anyone can write and deploy their own smart contract for their own use case. This nearly infinite number of use cases leads to the fragmentation of data schemas for different purposes. Standardizing these schemas requires deep domain expertise to turn esoteric technical outputs into clear information for specific concepts like tokens, NFTs, stablecoins and DEXs.
Allium abstracts the complexity with a simple way to query blockchain data
Allium tames the chaos by ingesting, sanitizing, and standardizing all this data. As of this post, the data we've archived across 100+ blockchains is in the petabytes and growing exponentially.
Bloomberg and Google had to organize the world's public financial and webpage data; Allium is on a mission to do the same for blockchain data
This is one of the rare times in history where indexing a giant public dataset is sorely needed by all, similar to what Bloomberg did for financial data and what Google did for public webpage data. With this indexed data, we are fortunate to support trailblazers in this industry and play some role in the industry's most exciting trends :
About our customers
We serve two groups of customers today with the same data but different platforms: Analysts who need to answer data questions about the blockchain (think BI) and Engineers who need highly reliable data queryable in near real time (think application backends). Our customers include the biggest institutions, such as Visa, Stripe, and Grayscale, as well as the biggest crypto companies, such as Phantom and Uniswap. Allium is one of the few companies in the industry that bridge the blockchain and non-blockchain worlds.
About the role
At Allium, Blockchain Data Wizards are the data architects who transform raw, fragmented on-chain data into elegant, meaningful abstractions that power the world's largest crypto applications.
You'll be equal parts detective, data engineer, and protocol researcher: diving deep into blockchain internals, reverse engineering DeFi protocols, and crafting data models that institutions and developers rely on.
What you will do
Transform messy on-chain data into unified, intuitive data models (e.g. dex.trades, lending.liquidations) that work across verticals and protocols, providing users with granular, reliable insights.
Dive deep into the mechanics of new DeFi protocols, dissect smart contract interactions and on-chain events, extract the data that matters, and transform it into clean, production-ready models using SQL / dbt.
Dive into a blockchain's raw data structures, ensure complete and accurate data capture, and design schemas that translate native blockchain data into reliable internal models, becoming the trusted source of truth for that ecosystem.
Design and create metrics people can actually use (volumes, fees, TVL, and user activity) and filter out noise like wash trades, bots, and Sybil attacks. Make it easy for the industry to see what's real and what matters.
Qualities
You find peculiarities in data that others miss. You question assumptions, dig deeper when something looks off, and help the industry redefine narratives with evidence-based insights.
You don't just build tables; you design data products. You think about query patterns, performance, documentation, and how end users will interact with your models.
New blockchains and protocols have sparse documentation. You thrive when you need to figure things out yourself through exploration and experimentation.
You can write elegant, performant SQL transformations that process large volumes of data. You understand incremental models, macros, and how to build maintainable data pipelines at scale.
Allium's Data Wizards in Action
Don't take our word for it. Here is what our customers say about us
What some ~cool people have to say about us :
OK, now for some tough love. Here are the values we strive for at Allium :
About the team
We invite people of all backgrounds. We have engineers who learnt coding much later in life, engineers who learnt coding on the side, engineers who are still in school, and engineers who went to the top schools (CMU, Stanford, UIUC, UPenn, Oxford, NUS, Cornell). All are welcome if they come in with a curious mind and an infectious work ethic.
Administrative Benefits
Medical, Dental, Vision, Life and AD&D insurance - US folks get 100% coverage for Gold plans, 80% for dependents
Note : The sun never sets on Allium - we hire from any geographical location as long as you are willing to overlap for 2 hours on NYC mornings, Mon-Thurs from 10am-12pm ET. We have people based in New York, Seattle, Singapore and Australia.
All applicants have to answer this pop quiz : "What is an Allium? What is your favorite Allium?". Bonus points for the right pronunciation.
Data Scientist • New York, NY, United States