How to read a text file directly from Internet using Java?

I am trying to read some words from an online text file.

I tried doing something like this

File file = new File("http://www.puzzlers.org/pub/wordlists/pocket.txt");
Scanner scan = new Scanner(file);

but it didn't work, I am getting

http://www.puzzlers.org/pub/wordlists/pocket.txt 

as the output and I just want to get all the words.

I know they taught me this back in the day but I don't remember exactly how to do it now, any help is greatly appreciated.


Solution 1:

Use an URL instead of File for any access that is not on your local computer.

URL url = new URL("http://www.puzzlers.org/pub/wordlists/pocket.txt");
Scanner s = new Scanner(url.openStream());

Actually, URL is even more generally useful, also for local access (use a file: URL), jar files, and about everything that one can retrieve somehow.

The way above interprets the file in your platforms default encoding. If you want to use the encoding indicated by the server instead, you have to use a URLConnection and parse it's content type, like indicated in the answers to this question.


About your Error, make sure your file compiles without any errors - you need to handle the exceptions. Click the red messages given by your IDE, it should show you a recommendation how to fix it. Do not start a program which does not compile (even if the IDE allows this).

Here with some sample exception-handling:

try {
   URL url = new URL("http://www.puzzlers.org/pub/wordlists/pocket.txt");
   Scanner s = new Scanner(url.openStream());
   // read from your scanner
}
catch(IOException ex) {
   // there was some connection problem, or the file did not exist on the server,
   // or your URL was not in the right format.
   // think about what to do now, and put it here.
   ex.printStackTrace(); // for now, simply output it.
}

Solution 2:

try something like this

 URL u = new URL("http://www.puzzlers.org/pub/wordlists/pocket.txt");
 InputStream in = u.openStream();

Then use it as any plain old input stream

Solution 3:

What really worked to me: (source: oracle documentation "reading url")

 import java.net.*;
 import java.io.*;

 public class UrlTextfile {
public static void main(String[] args) throws Exception {

    URL oracle = new URL("http://yoursite.com/yourfile.txt");
    BufferedReader in = new BufferedReader(
    new InputStreamReader(oracle.openStream()));

    String inputLine;
    while ((inputLine = in.readLine()) != null)
        System.out.println(inputLine);
    in.close();
}
 }

Solution 4:

Using Apache Commons IO:

import org.apache.commons.io.IOUtils;

import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public static String readURLToString(String url) throws IOException
{
    try (InputStream inputStream = new URL(url).openStream())
    {
        return IOUtils.toString(inputStream, StandardCharsets.UTF_8);
    }
}