hckrnws
Show HN: Open-source BI and analytics for engineers
by louisjoejordan
We are building Quary (https://quary.dev), an engineer-first BI/analytics product. You can find our repo at https://github.com/quarylabs/quary and our website at https://www.quary.dev/. There’s a demo video here: https://www.youtube.com/watch?v=o3hO65_lkGU
As engineers who have worked on data at startups and Amazon, we were frustrated by self-serve BI tools. They seemed dumbed down and they always required us to abandon our local dev tools we know and love (e.g. copilot, git). For us and for everyone we speak to, they end up being a mess.
Based on this, we decided there was a need for engineer-oriented BI and analytics software.
Quary solves these pain points by bringing standard software practices (version control, testing, refactoring, ci/cd, open-source, etc.) to the BI and analytics workflow.
We integrate with many databases, but we’re showcasing our slick Supabase integration, because it: (1) keeps your data safe by running on your machine without data flowing through our servers; and (2) enables you to quickly build an analytics layer on top of your Supabase Postgres instances. Check out our Supabase guide: https://www.quary.dev/docs/quickstart-supabase
What we’re launching today is open source under the Apache 2.0 license. We plan to keep the developer core open source and add paid features like a web platform to easily share data models (per-seat pricing), and an orchestration engine to materialize your data models.
Please try Quary at https://quary.dev and let us know what you think! We're excited to put the power of BI and analytics into the hands of engineers.
Congrats on the launch!
I've built analytics products, and the good thing about dashboards is that there's budget for them. People like eye-candy, and are willing to pay for it. I like how you picked Postgres as your initial database, because I think it's still the #1 database for analytics (even though it's OLTP) that no one talks about.
The three products where I think you may want to write short comparison pages are:
- Rill
- Preset
- Metabase
And I'd take a hard look at ClickHouse as your next database. They're missing a dashboard partner. And I think their users are much more engineering-centric and therefore a better fit for you than the analytics crowd around Snowflake.
Appreciate your feedback and guidance.
I was just at the ClickHouse office a few weeks ago - this is a really good idea.
Side comment: what an interesting landing page it has. That Slack CTA button right within the fold is a good idea. A walkthrough and a way to schedule a meeting with the founders. This is very straightforward. Good luck!
Hey! OP here. This made my day, thank you!
Congrats on the launch!
I think there are a few players in this space (dev-friendly BI tools) already:
- Holistics.io
- Lightdash
- Hashboard
These tools all allow analysts to use both/either a local/cloud IDE to write analytics logic, and check in to Git version control.
How do you plan to differentiate from them?
All these comments ask for comparisons. It might be worth creating some alternative pages like podia do [1]. It could be helpful for your growth.
Seems like a cool project!
Hey! OP here. This is really good feedback, thank you.
Very cool!
Do you anticipate going more towards improving the data modelling capabilities (take on dbt et al) or more towards Business Intelligence (dashboards then hosting then drag&drop query builder all the way until the dreaded pdf export)?
Something that is overlooked in the dbt direction is how complex data models get. In BI, nothing seems overlooked; it is just hard!
I like that you have a clear anti-ICP [dbt customers, analysts]. This keeps you clear of the BI/DWH space. I do wonder how you avoid getting stuck in the BI tar pit [], or avoid getting stuck in the dbt middleware zone. Maybe with a core focus on engineers getting further and further without needing a BI/data team!
[]https://twitter.com/generick_ez/status/1782844341674786952
Hitting us with the hard questions! And honestly this is something people overlook about the realities of the data space.
It's always easier to communicate what you are not, so let's begin there:
- drag and drop query builder
- absolute prettiest graphs
- tailored to the least sophisticated user
In addition to what I think we are not, I think there is some space for our belief about what the data space is/is not:
- We don't think data self-serve is possible but rather small datasets can be tailored. Fundamentally it comes down to complexity and data is complex. It takes expertise/skill to get value from data.
- It takes experts to get value from data, but as systems get better it will take fewer.
- Businesses should not be data driven, they should be reason driven.
- We don't think data dominates business; it's a supporting tool, so we don't think it will ever be the entry point for everything but rather that it supports process. So we want to appear in places where we can support that reasoning, like a chart in a Notion doc.
Now a few bits about what we are:
- Tool for those experts and engineers
- Tool to make them the most productive ever
- Prevent the messes that people get themselves into in BI by injecting software engineering practices into the process (we know many companies with full-time employees responsible for cleaning up messes)
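To make "injecting software engineering practices" a little more concrete, here is a minimal sketch of one such practice: automated tests on a data model. It is not Quary's test syntax. It assumes a hypothetical `raw_orders` table and `orders` model, uses SQLite from the Python standard library as a stand-in database, and follows the common convention that a test is a query that should return no rows.

```python
import sqlite3

# Stand-in database with hypothetical data; table and column names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_orders (id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?)",
    [(1, 10.0), (2, 25.5), (2, 25.5), (3, None)],  # one duplicate, one missing amount
)

# The "model": deduplicated orders that have a known amount.
conn.execute(
    "CREATE VIEW orders AS "
    "SELECT DISTINCT id, amount FROM raw_orders WHERE amount IS NOT NULL"
)

# Each test is a query that returns the offending rows; an empty result means it passes.
tests = {
    "orders.id is unique": "SELECT id FROM orders GROUP BY id HAVING COUNT(*) > 1",
    "orders.amount is not null": "SELECT id FROM orders WHERE amount IS NULL",
}

results = {name: conn.execute(sql).fetchall() for name, sql in tests.items()}
for name, failing_rows in results.items():
    status = "PASS" if not failing_rows else f"FAIL {failing_rows}"
    print(f"{name}: {status}")
```

Because the tests are plain SQL kept next to the model, they can be version controlled and run in CI like any other test suite.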
> Now a few bits about what we are:
These are great. I think a lot about BI tools, seldom find any that I would use!
We're all in on https://www.sigmacomputing.com/ bc we don't like hosting/managing/provisioning essential tools like this, and this seems more complicated to configure.
I would recommend a simpler setup like Metabase Docker (which I re-evaluated recently): https://www.metabase.com/docs/latest/installation-and-operat...
Appreciate the feedback! We'll keep this in mind.
There is nothing to host/provision, so it's simple in that sense. You just run it locally with your credentials and connect directly to your database.
It is definitely not the easiest to set up especially when thinking as a team so we'll keep that in mind.
Seems a bit similar to Redash: writing SQL to build dashboards. Or using pygwalker + streamlit for more customization. https://docs.kanaries.net/pygwalker
I'll ask another of the "how is this different" questions - how is this different from https://evidence.dev/ ? Quary seems a little like dbt + Evidence from what I can see.
We're big fans of the Evidence team. While there's some overlap, we have a heavier focus on data modelling (similar to dbt). The key difference is we've rebuilt the modelling layer in Rust, leveraging WASM for better performance and browser-based execution. This lets us build a more seamless, end-to-end workflow encompassing transformation + viz optimised for the web.
Thanks for the response!
Tried going through the onboarding sample project from within VS-Code locally… I know, you suggest trying it in the Github Browser, but, hey, I'm perverse and it's available as an option within the extension.
It's not at all clear from the documentation or the onboarding notes how to seed a SQLite in-memory database and the CSVs in the `seeds` directory are sometimes referred to in the sample schemas, but sometimes not. So, kinda got stuck.
I know if I stuck with it (I got impatient), I'd figure it out myself, but it does seem to be a missing element in the docs.
Looks fascinating, though.
Kinda like Elixir LiveBook, but focussed on DBs.
Congrats on the launch!
I've been evaluating Evidence and Observable Framework for a while, and this seems like a nice addition as an alternative.
But I just realized you require login when using vs code, what is it used for? And can I completely self host this?
Thanks!
Hey! We killed the auth flow from our extension; we used it to get an idea of how many people are using it. The extension works entirely locally and connects to your database through your machine. So there's no need to self-host anything!
We are looking at moving our Power BI stuff to Apache Superset [1]. How does this compare to Superset?
Superset is a beautiful tool focused on self-serve with amazing visualizations. I won't take anything away from them!
Our thesis is that self-serve is much less important than people think, and we find people often make a mess of never-ending dashboards. Current BI tools struggle to prevent that. We solve this problem with a core of software engineering practices.
If you're targeting use within software and engineering teams, that thesis may be right. If you're targeting adoption across whole businesses, I think the thesis is pretty wrong and will end up hampering adoption. To broadly bucket BI challenges, there's first the challenge of getting people to use the thing, then the challenges that come when everyone is using the thing. Tech types seem to underrate the challenge of getting people to even use a BI tool in the first place.
I've found self serve to be a really effective tool in getting engagement with BI. My onboarding for new non-tech BI users was always to have them build a basic dashboard for the business process they were most focused on. Maybe set an alert or create a scheduled report delivery. By the end of a 15 or 30 minute onboarding session you'd see the click as they realized what they could do with it.
That mess of never ending dashboards has another name: BI engagement. Though a product can help, having core dashboards and KPIs is a social and analytics leadership problem and not a technical one.
Though I have issues with Looker (their dev experience is crappy), their approach to this is effective: make it difficult for self-serve users to get incorrect or nonsense answers, and make it easy for analytics admins to designate core dashboards and jockey a few hundred custom dashboards and reports as the underlying data models change. Every business unit got pretty attached to what they'd built for themselves.
You're spot on that BI adoption is largely a social challenge. Our thesis is that by defining the entire journey from source to viz as code, we create a structured foundation that LLMs can build upon, democratizing access to the transformation layer for non-engineers in a way that point-and-click BI tools can't.
Can you please elaborate on how you see LLMs could build upon this model/journey?
LLMs would generate the code/definitions underlying these dashboards; presumably a model could be trained for the task. I'll argue it trades one version of the sprawl problem for another. Unless this generated code is easy to debug and comprehends other generated code, it will still be a spaghetti mess at scale.
Looks pretty exciting, congrats. From looking at the intro video and skimming through the documentation, I think I mostly understood what it does and how it works. What I don't understand is the endpoint: can I show the dashboards to an end-user? Does it build a website, or is its usage limited to inside VS Code?
We've been focusing on the core VS Code extension and haven't released sharing yet. The plan is to provide a Vercel-like experience for deploying and sharing graphs.
People will be able to connect their GitHub repositories, deploy dashboards, and share them via our website. The interface will allow switching between branches and time-traveling between different states of the dashboard.
Here's a preview: https://www.youtube.com/watch?v=MD6In-iUd9g
Seems similar to plotly dash, no?
The biggest difference I see (though I'm not super familiar with Plotly) is that we define data transformations in SQL, while Plotly uses Python. One benefit of SQL is that it provides the advantage of tracing data lineage from source to visualization, which gives you visibility into data dependencies - something that Python code in Plotly Dash doesn't offer.
It's unfortunate that org-mode is not more wide-spread (linked to Emacs). Org-mode covers this and a million other use-cases. Don't get me wrong though, this looks really good. So, congrats to OP :)
Appreciate the kind feedback! Curious to know if org-mode is still actively maintained.
I tried org-mode for SQL queries but then went back to sql-mode because LSP is not supported in org-mode. Also, how do you use charts with it?
From an external look, that sounds a lot like what dbt is meant to be. Why would one choose quary over dbt?
Hey, OP here. We love what dbt has done for transformation-layer engineering. But we often see companies still struggling with a mess of unstructured dashboards, even with solid dbt models underneath.
The problem is that dbt models and BI dashboards are often managed by separate teams. Quary brings the two together, letting engineers define reusable models and build well-structured dashboards on top of them in one cohesive, code-first environment.
I think it finally occurred to me that you care only to transform data insofar as it is for the purpose of being used in BI/dashboards and not for data warehouse purposes. That wasn't clear to me at first but it makes sense.
While that's somewhat true, our CLI can push transformations back to your warehouse. We and some of our customers use Quary for "data warehouse purposes" too. We think the integrated flow makes the E2E experience very quick.
So it’s Looker and LookML.
Does it support datasource merges like Redash does? I had a hard time looking for a simple solution where I could easily join data from multiple sources and provide simple charts from engineering to support teams.
We do if you use DuckDB and you pull data from your data sources through DuckDB. DuckDB can act as a single interface between multiple data source types. Feel free to DM me with any more questions around your specific use-case and I can help.
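As a rough illustration of the "single interface" idea, the sketch below joins tables that live in two separate databases through one connection. To stay dependency-free it uses SQLite's `ATTACH DATABASE` from the Python standard library as a stand-in; it is not DuckDB or Quary code, and the `users` and `tickets` tables are hypothetical. The point is only the shape of the workflow: attach each source under its own name, then write one query across them.

```python
import sqlite3

# One connection acts as the single interface. "main" plays the role of an
# engineering database; "support" is a second, separate database.
# (SQLite stand-in for illustration only; DuckDB is what the reply refers to.)
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS support")

# Hypothetical data in each source.
conn.execute("CREATE TABLE main.users (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute(
    "CREATE TABLE support.tickets (id INTEGER PRIMARY KEY, user_id INTEGER, status TEXT)"
)
conn.executemany(
    "INSERT INTO main.users VALUES (?, ?)",
    [(1, "a@example.com"), (2, "b@example.com")],
)
conn.executemany(
    "INSERT INTO support.tickets VALUES (?, ?, ?)",
    [(10, 1, "open"), (11, 1, "closed"), (12, 2, "open")],
)

# A single query joins across both attached databases.
open_tickets_per_user = conn.execute(
    """
    SELECT u.email, COUNT(t.id) AS open_tickets
    FROM main.users AS u
    JOIN support.tickets AS t ON t.user_id = u.id
    WHERE t.status = 'open'
    GROUP BY u.email
    ORDER BY u.email
    """
).fetchall()
print(open_tickets_per_user)
```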
This would make a good blog tutorial, I think.
I think so too, will put this as a to-do.
Sounds interesting, I'll give it a try.
Great! Feel free to reach out to me with any questions.
How is this different from Lightdash? https://github.com/lightdash/lightdash
Big fans of our fellow YC mates at Lightdash!
There are some core differences that make our product feel quite different:
- Lightdash isn't Lightdash without dbt so you always have that divide even though they have done a fab job of minimizing it.
- The editor for us is in Visual Studio Code which means you don't have that jump and can iterate all together.
- Everything is version controlled as a file in your repository, which means you can add those engineering practices to the dashboards/charts themselves.
What do you mean by "Lightdash isn't Lightdash without dbt"?
Needs dbt to function
I’m not sure how to think of this, is it an engineer first version of Metabase?
That's useful feedback for us! We can improve our messaging.
The simple answer is yes: Our focus is code first, from modeling to charts and dashboards, and not self-serve.
We often found that keeping BI applications/dashboards organized is very difficult so we're adding engineering practices.
How is it different from Grafana?
Ben here from Quary.
We love Grafana! It's fab for building dashboards, but it's focused on dashboarding/alerts and on pulling from various data sources, not just SQL.
Quary is purely focused on SQL, and crucially, it allows you to build up and develop more complex transformations.
Just out of curiosity, what was the reason for the MIT -> Apache 2 move? https://github.com/quarylabs/quary/commit/db7a42a58ce66df13f...
Hey, Ben here from Quary; very valid comments like the one copied below meant we rethought our strategy a little. We want to be open source but think we need a little protection.
"Hate to derail the conversation, but is Quary something I could easily whitelabel to embed BI into my product for my customers? (Passively) looking for solutions in that that don't feel dumbed down."
You mean protection as in protection from intellectual property (patent) lawsuits?
Yep, I meant protection in terms of intellectual property.
Wondering if the signing in is mandatory to use it?
No sign in needed!
Thx for clearing that up, it was not so obvious from the https://www.quary.dev/docs/sample-project#signing-in-to-quar... section.
I appreciate the use of Tailwind scroll-margin on your anchors btw, caring for details is communicative ;)
See also Eclipse BIRT ... https://en.wikipedia.org/wiki/BIRT_Project . It seems to have languished for a while but it's active once again based on updates to this Stack Overflow posting: https://stackoverflow.com/questions/53362448/development-sta....
Great to see BIRT mentioned on HN. I use BIRT to generate PDFs for clients. Modern BI tools are about interactivity and real time but PDFs still have a role in BI and BIRT does the job. As it uses JDBC to connect to data sources you can connect to most data sources. For many tools these days one of the first things you have to check is which data sources does it connect to. If you use a less popular database chances are your database will not be supported. I have worked in organisations that use DB2, Sybase, Oracle and so on and these tend not to be supported by modern BI tools. PDF generation also seems to be a snapshot of the page. So yes BIRT is a great tool, old school and a bit clunky but it does the job.
This is awesome! Great to see this project still alive after so many years :)
Resembles Redash.
Hey! We love Redash too. Where Quary is different is that we have more of an emphasis on Transformation. This means people can split out complex SQL blocks into modular, reusable components which improves data lineage (how the data flows from table to visualisation).
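A minimal sketch of why modular SQL helps with lineage: once each block is a named model, the dependencies between models form a graph that can be walked from a visualisation back to its sources. This is not how Quary implements it. The model names and SQL are hypothetical, and dependencies are detected with a naive token match rather than a real SQL parser.

```python
import re
from graphlib import TopologicalSorter

# Hypothetical models: name -> SQL. Tables such as raw_orders are external sources.
models = {
    "stg_orders": "SELECT id, customer_id, amount FROM raw_orders",
    "stg_customers": "SELECT id, country FROM raw_customers",
    "orders_by_country": (
        "SELECT c.country, SUM(o.amount) AS revenue "
        "FROM stg_orders AS o JOIN stg_customers AS c ON c.id = o.customer_id "
        "GROUP BY c.country"
    ),
    "revenue_chart": "SELECT country, revenue FROM orders_by_country ORDER BY revenue DESC",
}


def dependencies(name):
    """Return the other models mentioned in this model's SQL (naive token match)."""
    tokens = set(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", models[name]))
    return {token for token in tokens if token in models and token != name}


# Lineage graph: each model mapped to the models it reads from.
lineage = {name: dependencies(name) for name in models}

# A valid build order: every model appears after the models it depends on.
build_order = list(TopologicalSorter(lineage).static_order())


def upstream(name):
    """Return every model that feeds, directly or indirectly, into `name`."""
    seen = set()
    stack = [name]
    while stack:
        for dep in lineage[stack.pop()]:
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


print(build_order)
print(sorted(upstream("revenue_chart")))
```

With one monolithic query behind the chart, none of this structure is visible; split into models, the path from `revenue_chart` back to the staging models falls out of the references themselves.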
Dbt makes transformations modular and easier. It applies software development methods to the T of ELT.
How does it differ from OpenDashboard?
From what I can see, OpenDashboard is tackling workflow automation tasks. We're more focused on the data modelling process.
Great product!
How does Quary compare to Rill?
Hey, great question ... Again another tool we love. A few key differences:
- Visual Studio Code as the editor through and through
- Dashboards are fully defined in code in Quary, which is different from Rill
- At its core our architecture is also very different: Rill is built on top of DuckDB for that interactivity, which can call out to other databases, whereas we can call other SQL databases without everything going through DuckDB.
this looks awesome!
Hey! Thanks so much, really appreciate the feedback
thanks!
Curious what it might take to add AWS Athena as another back end?
Someone else added this as an issue! We're happy to take a look at it. To gauge interest, please upvote it :)
Hate to derail the conversation, but is Quary something I could easily whitelabel to embed BI into my product for my customers? (Passively) looking for solutions in that that don’t feel dumbed down.
Hey! OP here, I don't have a clear answer for this yet. We're exploring ways to make Quary more extensible. We are focusing on the core piece first, happy to chat to hear more about your specific use-case.