API Discovery: Can we do better?

psadauskas · on Nov 21, 2016

It sounds like what you're describing is very similar to Hydra[1], which builds upon JSON-LD[2], which builds upon schema.org.

JSON-LD provides a vocabulary (schema) for linking various documents together, so you can for example link a blog post document directly to its author. Hydra then provides a vocabulary that lets you describe how to fetch that author document, what fields are required to create a new author object, and how to update the blog post to instead link to that new author. In this case, both "author" and "blog post" can be their respective schema.org documents.

[1]: http://www.hydra-cg.com/ [2]: http://json-ld.org/
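To make the linking idea concrete, here is a minimal sketch of two JSON-LD documents built as plain Python dictionaries. The `example.com` identifiers and the `relink_author` helper are hypothetical, chosen only for illustration; the `BlogPosting`, `Person`, `headline`, and `author` terms are from schema.org.

```python
import json

# Hypothetical identifiers: the example.com URLs are placeholders.
author = {
    "@context": "https://schema.org",
    "@id": "https://example.com/people/alice",
    "@type": "Person",
    "name": "Alice",
}

post = {
    "@context": "https://schema.org",
    "@id": "https://example.com/posts/1",
    "@type": "BlogPosting",
    "headline": "API Discovery",
    # The link: the author is referenced by identifier, not embedded.
    "author": {"@id": author["@id"]},
}


def relink_author(document, new_author_id):
    """Return a copy of the document that links to a different author."""
    updated = dict(document)
    updated["author"] = {"@id": new_author_id}
    return updated


if __name__ == "__main__":
    print(json.dumps(post, indent=2))
```

Hydra's contribution would be describing, in a machine-readable way, which HTTP operations perform a fetch or an update like `relink_author`; that part is not shown here.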

asbjornu · on Nov 22, 2016

My thought was Hydra when I read the suggestion as well. As Ivan Goncharov mentions JSON-LD in his blog post, I find it a perfect match.

welder · on Nov 21, 2016

> Then you automatically assume it’s a REST API using JSON payloads. Who’s doing anything else nowadays?

I'm seeing people using GraphQL[1] more these days, which also solves the problem of API schema discovery because it's one endpoint with built-in introspection[2].

[1] https://graph.cool/

[2] http://graphql.org/learn/introspection/
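As a rough illustration of what introspection looks like in practice, the sketch below builds the JSON body for an introspection query and pulls type names out of a response. The `__schema`, `types`, `name`, and `kind` fields are standard GraphQL introspection fields; the helper names and the sample response are made up for this example, and no network call is made.

```python
import json

# Introspection query using standard fields from the GraphQL specification.
INTROSPECTION_QUERY = """
query {
  __schema {
    queryType { name }
    types { name kind }
  }
}
"""


def build_introspection_request(query=INTROSPECTION_QUERY):
    """Build the JSON body that would be POSTed to a GraphQL endpoint."""
    return json.dumps({"query": query})


def type_names(response):
    """List the schema's type names, skipping built-in '__' meta types."""
    types = response["data"]["__schema"]["types"]
    return sorted(t["name"] for t in types if not t["name"].startswith("__"))


# Hand-written sample response (an assumption, not output from a real API).
sample_response = {
    "data": {
        "__schema": {
            "queryType": {"name": "Query"},
            "types": [
                {"name": "Query", "kind": "OBJECT"},
                {"name": "User", "kind": "OBJECT"},
                {"name": "String", "kind": "SCALAR"},
                {"name": "__Schema", "kind": "OBJECT"},
            ],
        }
    }
}

if __name__ == "__main__":
    print(build_introspection_request())
    print(type_names(sample_response))
```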

IvanGoncharov · on Nov 21, 2016

It was a joke to make the article easier to read.

> people using GraphQL[1] more these days

I fully agree. That's why we also maintain a list of GraphQL APIs: https://github.com/APIs-guru/graphql-apis

> solves the problem of API schema discovery because it's one endpoint with built-in introspection

You're absolutely right. We plan to do a few interesting projects around GraphQL. Subscribe to our blog to not miss announcements :)

dmd · on Nov 21, 2016

Here in "enterprise" people are still building SOAP apps, sadly.

oneweekwonder · on Nov 21, 2016

I wonder if the author knows about Swagger specs; They also started the OpenAPI initiative[0].

I have now wrapped a couple of APIs in Swagger and am using swagger-ui to create beautiful interactive API docs for users.

Moving forward I'm going to wrap most of our legacy systems around a swagger spec'ed API and discontinue any alternative access to these systems as time goes by.

[0]: http://swagger.io/introducing-the-open-api-initiative/

IvanGoncharov · on Nov 21, 2016

Hi,

I'm the author, and I definitely know about Swagger/OpenAPI. By coincidence, I'm maintainer of the collection of 250+ Swagger specs for public APIs: https://github.com/APIs-guru/openapi-directory

But I have learned the hard truth over last two years: API catalogs aren't scalable solutions for API discovery. That's why I'm pushing this.

P.S. it is pretty easy to generate Schema.org type based on Swagger/OpenAPI spec.
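One possible shape for such a conversion is sketched below. It is not the author's implementation: the `swagger_to_schema_org` function is hypothetical, and the choice of a `WebAPI` type with a `documentation` property is an assumption about the target vocabulary. The input fields (`info.title`, `info.description`, `host`, `basePath`, `schemes`, `externalDocs.url`) are from the Swagger 2.0 format.

```python
def swagger_to_schema_org(spec):
    """Map a Swagger 2.0 document (as a dict) to a JSON-LD description.

    The target type and property names are assumptions for illustration.
    """
    info = spec.get("info", {})
    doc = {
        "@context": "https://schema.org",
        "@type": "WebAPI",
        "name": info.get("title"),
        "description": info.get("description"),
    }
    if "host" in spec:
        # Default to https when the spec does not list any schemes.
        scheme = (spec.get("schemes") or ["https"])[0]
        doc["url"] = f"{scheme}://{spec['host']}{spec.get('basePath', '')}"
    if "externalDocs" in spec:
        doc["documentation"] = spec["externalDocs"]["url"]
    # Drop fields the spec did not provide.
    return {key: value for key, value in doc.items() if value is not None}


# Made-up spec used only to exercise the function.
sample_spec = {
    "swagger": "2.0",
    "info": {"title": "Example Mail API", "version": "1.0"},
    "host": "api.example.com",
    "basePath": "/v1",
    "schemes": ["https"],
    "externalDocs": {"url": "https://example.com/docs"},
}

if __name__ == "__main__":
    print(swagger_to_schema_org(sample_spec))
```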

oneweekwonder · on Nov 21, 2016

> I'm maintainer of the collection of 250+ Swagger specs for public APIs: https://github.com/APIs-guru/openapi-directory

Ah cool, thanks for the link and the effort. Will have a look at it as time allows.

> It is pretty easy to generate Schema.org type based on Swagger/OpenAPI spec.

Do you have an example or a gist with a snippet of code that accomplishes this?

> API catalogs aren't scalable solutions for API discovery. That's why I'm pushing this.

But doesn't the API catalog enable auto-discovery? I agree that maintaining a list by hand is not scalable.

stephenhuey · on Nov 21, 2016

In case this helps, the menu for apis.guru doesn't seem to be working on my iPhone 6s in Chrome or Safari. Actually, it just doesn't work on the homepage. It works if I go to the Browse APIs page.

romanhotsiy · on Nov 21, 2016

Thanks! That helped! Fixed now

agentgt · on Nov 21, 2016

> But I have learned the hard truth over last two years: API catalogs aren't scalable solutions for API discovery. That's why I'm pushing this.

Minor critique, but the word "discovery", particularly in service-oriented contexts, is heavily overloaded. It took me a little while to understand what the article meant by discovery and I'm still not entirely sure I do.

Do you mean this discovery: https://en.wikipedia.org/wiki/Service_discovery or do you mean a discovery more in the human sense (ie a consumer search directory listing API services)?

alemhnan · on Nov 21, 2016

Would you mind sharing a link for that? I am very interested, as I work on different projects with Swagger APIs.

romanhotsiy · on Nov 21, 2016

Here is the main site: https://apis.guru

And here is github repo: https://github.com/APIs-guru/openapi-directory

alemhnan · on Nov 21, 2016

thanks a lot!

jayd16 · on Nov 21, 2016

Swagger has been great in my personal usage. If google started indexing your swagger spec and made a search widget for it, that would really push it to critical mass.

ewittern · on Nov 21, 2016

We are working on an API "catalog" at IBM Research called API Harmony: https://apiharmony-open.mybluemix.net/

One of our goals is to mine information about APIs, rather than relying on user input (hence the "catalog"). We still have a long way to go, though. In Web API land, many things follow a power-law distribution: for a few well-known APIs, a lot of information can be found (https://apiharmony-open.mybluemix.net/apis/instagram). For many other APIs, information is sparse and hard to mine.

Adding Schema.org (or similar) annotations would help. But it may be as hard to convince providers to do this, as it is to convince them to write OpenAPI specifications (or similar).

Senji · on Nov 21, 2016

Having a problem with industry acceptance? Throw the government at it. Either mandate or give tax breaks for following standards that the govt decrees you must meet.

When you can't use the government, form a consortium of large companies with a key hold on their industries.

ehnto · on Nov 21, 2016

I put together a side project some time ago that allows you to retrieve and display arbitrary API endpoints, to create a dashboard of realtime data.

One of the bigger challenges was finding appropriate APIs to use. The idea being that it had to make sense as a realtime datapoint. Current temperature in California is one example.

Part of the solution I tried for the end user was to add a searchable database of APIs, but I never found many that had that kind of realtime datapoint. The majority of use cases for APIs seems to be CRUD apps, authentication, unchanging datasets, or really specific once off data transforms.

I feel like that's a bummer, as I really love the idea of being able to build up a dashboard of random data points about the world that interest me. The number of astronauts currently in space, the temperature somewhere I like to visit, the traffic density on specific roads. All retrievable individually from various apps, but not in one scifi intelligence center style place.

For anyone interested, the site was https://apiblocks.com

z4chj · on Nov 21, 2016

The best part about this site is the API database link in which you can find a Ron Swanson quotes API. Thank you for that entertainment

m3h · on Nov 21, 2016

Interesting thoughts. I've always wished we could have something like an `api.txt` file on a website, just like we have a `sitemap` file.

The `api.txt` can simply be a text-based listing of links to API descriptions for all APIs exposed by that site publicly. Most API description formats already allow for the API name, category, endpoints, etc. so no need to reinvent the wheel there. A good idea would be to add the `<link rel="api" href="api.txt" />` tag on the webpage too. This would make discovery and cataloging much easier.
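A crawler for this proposal could look roughly like the sketch below. Neither `rel="api"` nor `api.txt` is an existing standard; the file format assumed here (one URL per line, `#` for comments) and the helper names are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ApiLinkFinder(HTMLParser):
    """Collect href values of <link> tags whose rel includes "api"."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.api_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").lower().split()
        if "api" in rels and attributes.get("href"):
            # Resolve relative hrefs against the page URL.
            self.api_links.append(urljoin(self.base_url, attributes["href"]))


def find_api_links(html, base_url):
    """Return absolute URLs of all rel="api" links found in the page."""
    finder = ApiLinkFinder(base_url)
    finder.feed(html)
    return finder.api_links


def parse_api_txt(text):
    """Parse the assumed api.txt format: one URL per line, '#' comments."""
    lines = (line.strip() for line in text.splitlines())
    return [line for line in lines if line and not line.startswith("#")]


if __name__ == "__main__":
    page = '<html><head><link rel="api" href="api.txt" /></head></html>'
    print(find_api_links(page, "https://example.com/"))
```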

Some API cataloging sites have come up with interesting ideas. For example, there is a service https://sdks.io/ that can automatically generate SDKs right out of the API descriptions that they have crawled. It is powered by https://apimatic.io/

psadauskas · on Nov 21, 2016

Your thoughts on "/api.txt" are pretty similar to how a Hydra API can be discovered: https://www.hydra-cg.com/spec/latest/core/#discovering-a-hyd...

Essentially, there can be a Link HTTP header to the "Entry-Point" document for the API, which further links to the vocabulary that describes the resources understood by the system and the actions you can perform on them.

    Link: <http://api.example.com/doc/>; rel="http://www.w3.org/ns/hydra/core#apiDocumentation"
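A client could pick that URL out of the response roughly as follows. This is a simplified sketch, not a complete parser for the Link header syntax, and the function name `find_api_documentation` is made up for this example; the relation IRI is the one shown in the header above.

```python
import re

HYDRA_API_DOC = "http://www.w3.org/ns/hydra/core#apiDocumentation"

# Each link: a <target> followed by its parameters, up to the next "<".
LINK_RE = re.compile(r"<([^>]+)>([^<]*)")
# Each parameter: ;name="value" or ;name=value.
PARAM_RE = re.compile(r';\s*([\w-]+)\s*=\s*"?([^";,]+)"?')


def find_api_documentation(link_header):
    """Return the target of the Hydra apiDocumentation link, or None."""
    for target, raw_params in LINK_RE.findall(link_header):
        for name, value in PARAM_RE.findall(raw_params):
            # A rel parameter may hold several space-separated relations.
            if name.lower() == "rel" and HYDRA_API_DOC in value.split():
                return target
    return None


if __name__ == "__main__":
    header = (
        '<http://api.example.com/doc/>; '
        'rel="http://www.w3.org/ns/hydra/core#apiDocumentation"'
    )
    print(find_api_documentation(header))
```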

I recommend watching some of the talk videos listed on the hydra web page: http://www.hydra-cg.com/

tomc1985 · on Nov 21, 2016

I find it strange that people regard APIs as their own thing. I have never "searched the web" for "email APIs", in fact that whole sentence just sounds amateurish. Similarly what is all this tooling he refers to? WTF is an "API console"? Is that not curl?

One time, while interviewing a candidate, I picked a random buzzword on his resume (which happened to be API) and asked him to talk about it for a bit. He immediately launched into a mini-rant about all these crazy details with APIs that seemed to be foregone conclusions. Do people actually agonize over whether the API speaks XML or JSON? Or fret over the lack of a console? Are we talking about the same thing? You have an endpoint, you send it a request, you get back some data. WOW SO HARD!

psadauskas · on Nov 21, 2016

It is pretty hard. One API, fine, but at several points in my career, I've had to write an "API Integration" layer for an app. For example, send notifications to Slack or Hipchat or GitHub, or poll one of a half-dozen endpoints for new info, or post some data to a half-dozen others.

Usually each service provides a client package for whatever language you're using, but sometimes it's from a third party, and they're always at different levels of maintained-ness. In Ruby, there hasn't been a single overwhelming HTTP client, just several that passed in and out of popularity, so depending on when the initial release of the gems you're using was, you'll end up with 3-6 different HTTP client implementations, each with their own way to mock requests for your tests.

On top of the client libraries, you have to spend a not-insignificant amount of time reading the service's documentation, to see how they prefer you do auth, what methods to call on the library, what the payload needs to look like, what response codes to expect, what exceptions may be raised... It is a non-trivial amount of work for each additional service. And then you get to stay on top of API updates for the dozen services you integrate with, and hope they have an email list or RSS feed so that you have some warning before they completely break your integration.

The promise of schema.org (and JSONLD, and Hydra, as I mentioned in another comment), is that if we could all just stop NIH every damn API we write because "those look too complicated, how hard can it be to just return some JSON?", then people like me, who need to integrate with several APIs, wouldn't have to waste so much time on figuring out how your special snowflake actually works.

tomc1985 · on Nov 21, 2016

I've had to do this too, but the difficulties you describe are par for the course with any API, web or not. People have been dealing with this shit for ages, I really don't think the HN crowd has very much to add.

OJFord · on Nov 21, 2016

    > I find it strange that people regard APIs as their own
    > thing. I have never "searched the web" for "email APIs",
    > in fact that whole sentence just sounds amateurish.

What do you do if you know that you need an API for a thing, but you don't know who offers that, or who a good one is? Nothing wrong with searching for "email APIs", "payments API", "maps API", etc. - I don't know why you think that's "amateurish".

    > WTF is an "API console"? Is that not curl?

Sure, but maybe you prefer a GUI where after `GET /` you can click on 'foo' to do `GET /foo`; maybe you find that quicker and easier than using curl.

    > Do people actually agonize over whether the API speaks
    > XML or JSON?

I'm not going to throw a job interview over it, but personally I'm much more familiar with JSON and know how to handle it, so ceteris paribus, yes, I'd pick the JSON API. I'm sure others feel similarly about XML.

tomc1985 · on Nov 21, 2016

But it's all just data, is it not? I was of the opinion that any halfway-decent practitioner of computing has at least half-a-dozen different ways of getting at data or prying open its contents. Nearly every major programming environment -- particularly the ones in play with the HN crowd -- has more-than-adequate facilities for handling data in JSON, XML, and other formats. The interface to the API is usually a matter of sending the right command string over the wire, maybe you have to figure out some authentication ahead of time but most-to-all of this stuff is supposed to be write-once-run-forever. Are there additional steps in the workflow or something?

Part of the reason it is weird to see people talking about "APIs" now is because API is such a core, fundamental concept in programming that to use the term in its present context (as in, a web API which you issue commands via HTTP) does a disservice to all the other APIs out there. When people on HN talk about API considerations they seem to omit the 95% of other use cases there are for APIs outside of the realm of startups and webapps.

For example, bash could be considered an API to the Linux System Base, and bash commands use Linux's syscall API to achieve the user's ends. Syscalls are issued commands over an ABI usually handled by libc, and return data in the format prescribed by the function signature.

My TV presents an API that is accessible over USB. Granted, I am pretty sure that port is only for use by service techs, but if I wanted to I could probably hook up some electronics and figure out the programming interface. Commands are issued through the USB bus and data will (hopefully) come back in some format.

My MIDI keyboard offers an API through buttons, a little LCD panel, and MIDI-over-USB. I issue it commands through button presses on its control panel, or through MIDI or OSC protocols, and it also returns data (either a human-readable response on a LCD screen or more data over the USB wire). In both those cases I'll have to decode that data to figure out whether my command was accepted and/or what to do next.

Finally, a popular web-based CRM solution offers an event-lifecycle and data-tracking API that I have to interface with at work. I issue it commands over HTTPS and it responds in XML-over-HTTP, which like JSON, is fairly easy to parse in any number of different ways, and in any case its responses are only ever in XML so once that was written it was done with.

This is why this "API" dialogue is so strange. It is reductive, and if overarching solutions are going to be prescribed then they need to take into account the wider and entire world that concept lives in. I don't see how many of the prescriptions offered help the use cases outside the HN crowd -- in fact I would argue that the author needs to go back and comb existing literature outside of web programming, because the engineering and computer science worlds have been dealing with exactly these problems for decades.

The "APIs" talked about here refer to a very specific type of product for a very specific audience; there is a bigger world out there and people seem to forget that.

  > What do you do if you know that you need an API for a thing, but you don't know who offers that, or who a good one is? Nothing wrong with searching for "email APIs", "payments API", "maps API", etc. - I don't know why you think that's "amateurish".

Search for "[product name] API doc/faq/whatever" or search for code examples of people interfacing with it? It's amateurish because it reflects a clumsy and incomplete understanding of the concept of "API"

OJFord · on Nov 22, 2016

I agree. I suppose it's just a matter of context, and at least in the web API context it's almost invariably taken to mean "open, accessible, and documented".

OP is about making it even easier to access that documentation.

    > Search for "[product name] API doc/faq/whatever" or search for code examples of people interfacing with it?

That's fine if you know the `[product name]`. The "email APIs" example in OP was about discovery of those product names - who offers open web APIs for email; let me compare them.

dozzie · on Nov 21, 2016

It actually can be that hard. People who call these services that mainly expose an ad-hoc defined RPC interface "APIs" tend to overcomplicate their code. You get a plethora of endpoint URLs, GET parameters, methods, HTTP headers, and queries encoded in JSON or XML or GraphQL or whatever sent in the request body. It Ain't No Simple Thing[tm] with this many dimensions.

tomc1985 · on Nov 21, 2016

Maybe, but that's a problem with the design of the program, and the contents of that response. An API is only the button board, not its layout or the colors of each button. That's up to the program designer.

orliesaurus · on Nov 21, 2016

I am pretty sure there are a few API marketplaces out there where it's easy to see pricing and code all at once, without having to browse around like a mad person, just like Amazon has done for shopping, with reviews, ratings and Q&A.

zackmorris · on Nov 21, 2016

While a lot of smart people are looking at this, I wanted to ask: does anyone know of any viable projects that build semantic data from existing websites?

I have experience with Open Graph http://ogp.me/ and oEmbed http://oembed.com/ but something about humans having to manually categorize a site has never sat right with me. We should be able to grok structure from sites by how other sites refer to them, similar to what Google does. This may be related to open source search, I don't know.

sapeien · on Nov 21, 2016

This looks really great for discoverability that works within the web platform already, without the need for inventing a new format [0].

However, I'm not sure if this goes far enough. There exist some formats for describing what APIs can do in natural language, though I'm not convinced that they're really that useful [1][2].

[0] http://apisjson.org [1] http://alps.io [2] http://restdesc.org

BoppreH · on Nov 21, 2016

What about https://schema.org/APIReference ? Doesn't seem to be widely used, but exists.

sapeien · on Nov 21, 2016

I think HTTP-based APIs are closer to a service than a library with a public interface, even though the term API can be used interchangeably for both: https://schema.org/Service

ksavenkov · on Nov 21, 2016

Google supports around seven Schema.org entity types (https://developers.google.com/search/docs/guides/mark-up-con...). I think they look at the user segments, and people looking for APIs are not their main audience. Especially nowadays when AI and data science are much cooler than just software engineering ;-)

allthingsapi · on Nov 21, 2016

We discussed a similar problem with regard to the availability of swaggers and providers willing to do so https://news.ycombinator.com/item?id=11583377

detaro · on Nov 21, 2016

Semi-related question: is there a list of things that actually use Schema.org, outside of Google Search for some things?

Meic · on Nov 21, 2016

Schema.org is a joint effort between Google, Microsoft, Yahoo! and Yandex (all search engines) so that is going to be the primary use-case. But their FAQ section says other non-search-engine organisations may join later.

asbjornu · on Nov 22, 2016

JSON-LD[1], a way to express hyperlinks (linked data) in JSON documents, strongly encourages the use of shared vocabularies instead of every API being its own snowflake with its own private vocabulary that every developer needs to invest a lot of time into understanding. Schema.org is one such vocabulary.

Not having to read documentation to understand whether the "table" property in the JSON document means "coffee table" or "tabular data" is going to save developers worldwide umpteen man-years. Still, most API developers continue creating snowflakes with proprietary designs, protocols and vocabularies.
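The "table" ambiguity can be illustrated with a toy version of what a JSON-LD `@context` does: mapping short keys to globally unique identifiers. The `expand_keys` function below is a deliberately simplified stand-in for real JSON-LD expansion (it handles only flat term-to-IRI mappings), and both vocabulary URLs are hypothetical.

```python
def expand_keys(document):
    """Replace each key with the IRI its @context maps it to.

    Simplified: real JSON-LD expansion handles far more than flat maps.
    """
    context = document.get("@context", {})
    return {
        context.get(key, key): value
        for key, value in document.items()
        if key != "@context"
    }


# Two APIs using the same short key with different meanings.
furniture_doc = {
    "@context": {"table": "https://example.com/vocab/furniture#table"},
    "table": "oak coffee table",
}
report_doc = {
    "@context": {"table": "https://example.com/vocab/data#table"},
    "table": [[1, 2], [3, 4]],
}

if __name__ == "__main__":
    print(expand_keys(furniture_doc))
    print(expand_keys(report_doc))
```

After expansion the two documents no longer share a key, so a client can tell the meanings apart without reading either API's prose documentation.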

JSON-LD is a large step in the right direction by making it easy to share vocabularies. Hydra[2] takes the next logical step by standardizing the protocol as well, so no matter what your API does, it has one standardized way to express it.

[1]: http://json-ld.org/ [2]: http://www.hydra-cg.com/

virmundi · on Nov 21, 2016

Many recipe websites use it. Allrecipes.com has it buried in the tags. One interesting thing about it is the inconsistent application of fields. Units of measure are sometimes broken out into their own fields. Other sites clump them together. To fix this, we need to use a bit of NLP.

miguelrochefort · on Nov 21, 2016

This is stupid. The solution is the semantic web.

romanhotsiy · on Nov 21, 2016

The funniest part here is that the proposed solution actually IS the semantic WEB. Schema.org is based on semantic web technologies. Check out wikipedia example: https://en.wikipedia.org/wiki/Semantic_Web#Example and note they use Schema.org as vocabulary

sctb · on Nov 21, 2016

This doesn't count as a civil and substantive comment on HN, which we've asked you for multiple times. Please stop or we have to ban the account.

angersock · on Nov 21, 2016

> But why does searching for APIs and restaurants differ so much?

Because--oh I don't know--APIs aren't restaurants?

Is this really a problem, or is this a tool to help lazy blub developers be even lazier and blubbier?

If you care enough about an API to worry about what things it does via the docs--and why aren't you using a client lib anyways?--then you care too much to be satisfied with a simple sketch of the API via Google search results.