This is really nice, especially the PDF report generation.
I feel very moronic making a dashboard for any product now. Enterprise customers prefer you integrate into their ERPs anyway.
I think we lost the plot as an industry. I've always advocated for making a read-only database connection available to your customers so they can build their own visualisations. This should have been the standard 10 years ago, and its case is only stronger in this age of LLMs.
We get so involved with our products that we forget our customers are humans too. Nobody wants another account to manage or remember. Analytics and alerts should be push-based: configurable reports should get auto-generated and sent to your inbox, alerts should be pushed via notifications or emails, and customers should have the option to build their own dashboards with something like this.
Sane defaults make sense, but location matters just as much.
> I've always advocated for having a read only database connection to be available for your customers to make their own visualisations.
Roughly three decades ago, that *was* the norm. One of the more popular tools for achieving that was Crystal Reports[1].
In the late 90s, it was almost routine for software vendors to bundle Crystal Reports with their software (very similar to how the MSSQL installer gets invoked by products), then configure an ODBC data source which connected to the appropriate database.
In my opinion, the primary stumbling block of this approach was the lack of a shared SQL query repository. So if you weren't intimately familiar with the data model you wanted to work with, you'd lose hours trying to figure it out on your own, or rely on your colleagues sharing it via sneakernet or email.
Crystal Reports has since been acquired by SAP, and I haven’t touched it since the early ‘00s so I don’t know what it looks or functions like today.
My best friend from early uni days did a co-op with Crystal Services, and he's been with them for their entire history through Seagate Software, Crystal Decisions, BusinessObjects (and relocating from Canada to France) and then SAP. I myself have had 2 temporary retirements, at least 4 different careers and countless jobs in that time, and it's wild to know someone who has the same internal drive but has satisfied it with a much more linear path (though you could definitely argue he's seen just as much change as me). From employee ~50 to ~100,050!
Aaaaaah I had a professor rave about Crystal Reports once. Didn't know it had such a gilded history.
> I think we lost the plot as an industry
I get your point, but generally with most enterprise-scale apps you really don’t want your transactional DB doubling as your data warehouse. The “push-based” operation should be limited to moving data from your tx environment to your analytical one.
Of course, if the “analytics” are limited to simple static reports, then a data warehouse is overkill.
100% agreed regarding shipping a read-replica, for any sufficiently complex enterprise app (ERP, CRM, accounting, etc.).
Customers need it to build custom reports, archive data into a warehouse, drive downstream systems (notifications, audits, compliance), and answer edge-case questions you didn’t anticipate.
Because of that, I generally prefer these patterns over a half-baked built-in analytics UI or an opinionated REST API:
Provide a read replica or CDC stream. Let sophisticated customers handle authz, modelling, and queries themselves. This gets harder with multi-tenant DBs.
Optionally offer a hosted Data API, using something like PostgREST, Hasura, or Microsoft DAB. You handle permissions and safety, but stay largely un-opinionated about access patterns.
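A minimal sketch of the first pattern, assuming a Postgres read replica; the role, database, and schema names here are hypothetical, and the last two settings guard against the resourcing concerns a shared replica raises:

```sql
-- Hypothetical per-customer read-only role on a Postgres read replica.
CREATE ROLE customer_reporting LOGIN PASSWORD 'changeme';
GRANT CONNECT ON DATABASE appdb TO customer_reporting;
GRANT USAGE ON SCHEMA reporting TO customer_reporting;
GRANT SELECT ON ALL TABLES IN SCHEMA reporting TO customer_reporting;
-- Keep the role read-only and bound the load it can place on the replica.
ALTER ROLE customer_reporting SET default_transaction_read_only = on;
ALTER ROLE customer_reporting SET statement_timeout = '30s';
```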
Any built-in metrics or analytics layer will always miss edge cases.
With AI agents becoming first-class consumers of enterprise data, direct read access is going to be non-negotiable.
Also, I predict the days of charging customers to access their own goddamn data behind rate-limited, metered REST APIs are behind us.
I fully agree in spirit, but in practice read replicas have some edge cases that are hard to control for. Namely, the incentives aren't fully aligned between the database host and the consumer, and that dynamic can lead to some difficult resourcing decisions for the DB host. An API, by contrast, can be rate limited, and the underlying queries can be optimized (however frustrating that might be for consumers).
The CDC stream option you flagged is more viable in my (admittedly biased) opinion. At my company (Prequel) our entire pitch is basically "you should give your customers a live replica of their data in whatever data platform they want it in" (and let us handle the cross-platform compatibility and multi-tenant DB challenges).
I think this problem could also be a killer use case for Open Table Formats, where the read-replica architecture can be mirrored but the cost of reader compute can be assumed by the data consumer.
To your point, this is only going to be more important with what will likely be a dramatic increase in AI agent data consumption.
In 1999-2000, the company I worked with gave a smallish number of key users full read rights to SAP (minus HR), shortly after introducing SAP to that company's global supply chain. The key users came from all orgs using SAP; basically every department had one or two.
I was part of this and "saw the light". We had such great visibility into all the processes, it was unreal. It tremendously sped up cross-org initiatives.
Today, I guess, only agents get that privilege.
hi, dev building Shaper here. I agree re sending reports vs dashboards. Many users use Shaper mostly as UI to filter data and then download a pdf, png or csv file to use elsewhere. We are also currently working on functionality to send out those files directly as messages using Shaper's task feature.
It would be a game changer, very interesting to see this grow. How did you get your PDF generation so good?
happy to hear that! pdfs are generated in a headless chrome in the same docker container as shaper itself using chromedp.
> I've always advocated for having a read only database connection to be available for your customers to make their own visualisations.
A layer on top of the database to account for auth/etc. would be necessary anyways. Could be achieved to some degree with views, but I'd prefer an approach where you choose the publicly available data explicitly.
GraphQL almost delivered on that dream. Something more opinionated would've been much better, though.
That's exactly what I meant. It's a specific replica instance with its own security etc., but not necessarily a separate API you have to integrate with. APIs can stay for writes, but for reads you have the DB.
Customers don’t want to learn your schema or deal with your clever optimizations either. If you expose a DB make sure you abstract everything away in a view and treat it like a versioned API.
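One way to do the view-as-versioned-API approach, sketched for Postgres with hypothetical table and role names: put the exposed surface in a versioned schema so internals can change without breaking customer queries.

```sql
-- Hypothetical versioned reporting surface; internal tables stay private.
CREATE SCHEMA api_v1;
CREATE VIEW api_v1.orders AS
  SELECT o.id, o.created_at, o.total_cents, c.name AS customer_name
  FROM internal.orders o
  JOIN internal.customers c ON c.id = o.customer_id;
GRANT USAGE ON SCHEMA api_v1 TO customer_reporting;
GRANT SELECT ON api_v1.orders TO customer_reporting;
```

When the schema needs a breaking change, you ship `api_v2` alongside `api_v1` and deprecate the old views on a schedule, just like an API version.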
The best example for this are iot devices that share their data. Instead of reinventing the wheel for a dashboard for each customer just give them some docs and restricted access via a replica.
To what extent is this a Metabase alternative? I'm a heavy Metabase user and there's really nothing to compare in this product.
We've (https://www.definite.app/) replaced quite a few metabase accounts now and we have a built-in lakehouse using duckdb + ducklake, so I feel comfortable calling us a "duckdb-based metabase alternative".
When I see the title here, I think "BI with an embedded database", which is what we're building at Definite. A lot of people want dashboards / AI analysis without buying Snowflake, Fivetran, BI and stitching them all together.
hi, dev building Shaper here. Both Shaper and Metabase can be used to build dashboards for business intelligence and embedded analytics. But the use cases are different: Metabase is feature-rich and has lots of self-serve functionality that allows non-technical users to easily build their own dashboards and drill down as they please. With Shaper you define everything as code in SQL. It's much more minimal in terms of what you can configure, but if you like the SQL-based approach it can be pretty productive to treat dashboards as code.
sorry, so it ain't an alternative in any way. It's like saying a bicycle is an alternative to an airplane; both have seats...
Nice work! I met Jorin a couple years ago at a tech meetup and this was just an idea at the time. So cool to see the consistent progress and updates and to see this come across HN.
Is there any way to run the query -> report generation standalone, in process? Like maybe just outputting the HTML (or using the React components in a project).
I was looking to add similar report generation to a vscode-extension I've been building[0]
Thanks for the cool tool! I think it's worth mentioning SQLPage, another tool in a similar vein that generates UI from SQL. From my POV:
- SQLPage: more on UI building; doesn't use DuckDB
- Shaper: more on analytics/dashboard focused with PDF generation and stuff; uses DuckDB
Metabase works great with DuckDB as well, thanks to metabase_duckdb_driver by MotherDuck.
This is so cool and also MPL licensed! Thanks!
As someone who has used DuckDB but not Shaper, what is Shaper used for? The readme is scarce on details.
hi, dev building shaper here. shaper allows you to visualize data and build dashboards just by writing sql. the sql runs in duckdb so you can use all duckdb features. its for when you are looking for a minimal tool that allows you to just work in code. you can use shaper to build dashboards that you share internally or also for customer-facing dashboards you want to embed into another application.
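to give a flavor of the SQL-first idea (this is plain DuckDB SQL, not Shaper's actual dashboard syntax; the file path and columns are made up), a panel could be backed by a query like:

```sql
-- DuckDB can query files directly; path and columns are hypothetical.
SELECT
  date_trunc('day', created_at) AS day,
  count(*)                      AS orders,
  sum(total_cents) / 100.0      AS revenue
FROM read_parquet('data/orders/*.parquet')
GROUP BY 1
ORDER BY 1;
```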
Will it expose a visual query builder as metabase?
shaper leans into doing everything as code. instead of using a custom UI you can use your own editor and AI agent to generate dashboards for you. shaper is for people happy to use code. it doesn't try to provide self-serve functionality.
And I think that's exactly what makes it so clever. Three years ago, I would have considered this decision risky. But with the live sync feature and "just SQL" as the language for the dashboard builder, it's so powerful, thanks to Claude Code, for example!
my company integrated Taleshape's Shaper as our customer-facing Metabase dashboard alternative. absolutely love its simplicity!
interesting, i am trying to build one too but rejected duckdb because of its large size. i guess i will have to give in and use it at some point.
I wanted to love DuckDB but it was so crashy I had to give up.
I use it daily and it has never crashed. How long ago was this? I am a big fan of DuckDB. I plow through hundreds of GB of logs on a 5-year-old Linux laptop, no problem.
Same here. I have however seen a few out of memory cases in the past when given large input files.
it's not the focus and not very performant, but you can have it spill to disk if you run out of memory. I wouldn't suggest building a solution around this approach though; the sweet spot is data that fits in memory.
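for reference, the out-of-core behavior is controlled by DuckDB settings like these (real settings; the values here are just examples):

```sql
-- Cap memory and give DuckDB somewhere to spill intermediate results.
SET memory_limit = '4GB';
SET temp_directory = '/tmp/duckdb_spill';
```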
Really? How large? I’ve only managed to crash it with hundreds/thousands of files so far, but haven’t had many huge files to deal with.