How can I get Wikipedia content using Wikipedia's API?

I want to get the first paragraph of a Wikipedia article.

What is the API query to do so?


See this section in the MediaWiki documentation.

These are the key parameters.

prop=revisions&rvprop=content&rvsection=0

rvsection = 0 specifies to only return the lead section.

See this example.

http://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=content&rvsection=0&titles=pizza

To get the HTML, you can use similarly use action=parse http://en.wikipedia.org/w/api.php?action=parse&section=0&prop=text&page=pizza

Note that you'll have to strip out any templates or infoboxes.


See Is there a Wikipedia API just for retrieve the content summary? for other proposed solutions. Here is one that I suggested:

There is actually a very nice prop called extracts that can be used with queries designed specifically for this purpose. Extracts allow you to get article extracts (truncated article text). There is a parameter called exintro that can be used to retrieve the text in the zeroth section (no additional assets like images or infoboxes). You can also retrieve extracts with finer granularity such as by a certain number of characters (exchars) or by a certain number of sentences(exsentences)

Here is a sample query http://en.wikipedia.org/w/api.php?action=query&prop=extracts&format=json&exintro=&titles=Stack%20Overflow and the API sandbox http://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&prop=extracts&format=json&exintro=&titles=Stack%20Overflow to experiment more with this query.

Please note that if you want the first paragraph specifically you still need to get the first tag. However in this API call there are no additional assets like images to parse. If you are satisfied with this introduction summary you can retrieve the text by running a function like PHP's strip_tag that remove the HTML tags.