Read Content from Files which are inside Zip file
Solution 1:
If you're wondering how to get the file content from each ZipEntry
it's actually quite simple. Here's a sample code:
public static void main(String[] args) throws IOException {
ZipFile zipFile = new ZipFile("C:/test.zip");
Enumeration<? extends ZipEntry> entries = zipFile.entries();
while(entries.hasMoreElements()){
ZipEntry entry = entries.nextElement();
InputStream stream = zipFile.getInputStream(entry);
}
}
Once you have the InputStream you can read it however you want.
Solution 2:
As of Java 7, the NIO Api provides a better and more generic way of accessing the contents of Zip or Jar files. Actually, it is now a unified API which allows you to treat Zip files exactly like normal files.
In order to extract all of the files contained inside of a zip file in this API, you'd do this:
In Java 8:
private void extractAll(URI fromZip, Path toDirectory) throws IOException{
FileSystems.newFileSystem(fromZip, Collections.emptyMap())
.getRootDirectories()
.forEach(root -> {
// in a full implementation, you'd have to
// handle directories
Files.walk(root).forEach(path -> Files.copy(path, toDirectory));
});
}
In java 7:
private void extractAll(URI fromZip, Path toDirectory) throws IOException{
FileSystem zipFs = FileSystems.newFileSystem(fromZip, Collections.emptyMap());
for(Path root : zipFs.getRootDirectories()) {
Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
@Override
public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
throws IOException {
// You can do anything you want with the path here
Files.copy(file, toDirectory);
return FileVisitResult.CONTINUE;
}
@Override
public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs)
throws IOException {
// In a full implementation, you'd need to create each
// sub-directory of the destination directory before
// copying files into it
return super.preVisitDirectory(dir, attrs);
}
});
}
}
Solution 3:
Because of the condition in while
, the loop might never break:
while (entry != null) {
// If entry never becomes null here, loop will never break.
}
Instead of the null
check there, you can try this:
ZipEntry entry = null;
while ((entry = zip.getNextEntry()) != null) {
// Rest of your code
}
Solution 4:
Sample code you can use to let Tika take care of container files for you. http://wiki.apache.org/tika/RecursiveMetadata
Form what I can tell, the accepted solution will not work for cases where there are nested zip files. Tika, however will take care of such situations as well.