Hi Jan,
The goal of the Databus is to provide a replication/deployment infrastructure. Self-deployment is already implemented, although it is not yet well documented. There are two ways to set up your own SPARQL endpoint:
I understand that you want to get a better picture of the data before loading it. At the moment, we offer a preview: on the page, you can unfold the > to see the first 10 lines. There is also a property, dataid:nonEmptyLines "140614"^^xsd:decimal ; but its value is still broken: the dataset is almost 4 GB and therefore probably has more than 140k lines.
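If you want to check the line count yourself after downloading a file, something like the following should do. It is only a minimal sketch: the function names are made up for this mail, and I am assuming that a file ending in .bz2 is bzip2-compressed text and that anything else is plain text.

```python
import bz2
from typing import Iterable


def count_non_empty_lines(lines: Iterable[str]) -> int:
    """Count lines that contain at least one non-whitespace character."""
    return sum(1 for line in lines if line.strip())


def count_non_empty_lines_in_file(path: str) -> int:
    """Stream a file line by line so that a multi-GB dump is never held in memory."""
    # Assumption: the ".bz2" suffix means bzip2 compression; everything else is plain text.
    opener = bz2.open if path.endswith(".bz2") else open
    with opener(path, "rt", encoding="utf-8") as handle:
        return count_non_empty_lines(handle)
```

Calling count_non_empty_lines_in_file on the downloaded file gives you a number to compare against the dataid:nonEmptyLines value.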
We are currently implementing a triple store that keeps an analysis of all files on the bus, including VoID statistics (https://www.w3.org/TR/void/). VoID has void:distinctSubjects, which is what you are looking for. This will need 2-4 weeks (maybe more) to become available.
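Until then, you can approximate what void:distinctSubjects will report with a small local script. This is a rough sketch, not what the triple store will run: it assumes the file is in N-Triples (one triple per line, subject as the first whitespace-delimited token), and the function name is hypothetical.

```python
from typing import Iterable


def count_distinct_subjects(lines: Iterable[str]) -> int:
    """Count distinct subjects in an iterable of N-Triples lines."""
    subjects = set()
    for line in lines:
        line = line.strip()
        # Skip blank lines and N-Triples comment lines.
        if not line or line.startswith("#"):
            continue
        # In N-Triples the subject (IRI or blank node) contains no whitespace,
        # so it is the first whitespace-delimited token on the line.
        subjects.add(line.split(None, 1)[0])
    return len(subjects)


sample = [
    '<http://example.org/a> <http://example.org/p> "1" .',
    '<http://example.org/a> <http://example.org/q> "2" .',
    "# a comment line",
    '<http://example.org/b> <http://example.org/p> "3" .',
]
print(count_distinct_subjects(sample))  # prints 2
```

Note that the set of subjects is kept in memory, so for a file of several GB this may need a fair amount of RAM.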