Make a ZIP archive flat
A zip archive may contain other zip archives. You may want to extract the contents of each nested archive into the parent archive to get a flat structure.
Current archive structure
outer.zip
├first.txt
├inner.zip
│ ├game.exe
│ └subitem.bin
└picture.gif
Desired archive structure
flatten.zip
├first.txt
├picture.gif
├game.exe
└subitem.bin
If you are not familiar with Aspose.Zip, read how to extract a zip archive first.
Overall explanation
First, we need to list all the entries of the archive. Regular entries should be kept as they are; we should not even decompress them. Entries that are archives themselves need to be extracted to memory and removed from the outer archive. Their content then needs to be added to the main archive.
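To make the steps above concrete without any dependency, here is a minimal sketch of the same idea that uses only the JDK's java.util.zip classes instead of Aspose.Zip. It is a swapped-in technique, not the Aspose.Zip approach: FlattenSketch and its methods are hypothetical names, regular entries are decompressed and recompressed rather than kept as they are, inner entries are written in place of the nested archive instead of at the end, and only one level of nesting is handled. ZipOutputStream rejects duplicate entry names, so the sketch assumes names do not collide.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Locale;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Illustration with java.util.zip only; FlattenSketch is a hypothetical name.
final class FlattenSketch {

    // Copies every entry of `outer` into `flat`, replacing each nested .zip
    // entry with the entries it contains (one level deep).
    static void flatten(InputStream outer, OutputStream flat) throws IOException {
        try (ZipInputStream zin = new ZipInputStream(outer);
             ZipOutputStream zout = new ZipOutputStream(flat)) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                byte[] content = readCurrentEntry(zin);
                if (entry.getName().toLowerCase(Locale.ROOT).endsWith(".zip")) {
                    // The entry is itself an archive: open it from memory
                    // and copy its entries into the flat archive.
                    try (ZipInputStream inner =
                             new ZipInputStream(new ByteArrayInputStream(content))) {
                        ZipEntry ie;
                        while ((ie = inner.getNextEntry()) != null) {
                            write(zout, ie.getName(), readCurrentEntry(inner));
                        }
                    }
                } else {
                    // Regular entry: copy it under the same name.
                    write(zout, entry.getName(), content);
                }
            }
        }
    }

    private static void write(ZipOutputStream zout, String name, byte[] content)
            throws IOException {
        zout.putNextEntry(new ZipEntry(name));
        zout.write(content);
        zout.closeEntry();
    }

    // Reads the remaining bytes of the entry the stream is positioned at.
    private static byte[] readCurrentEntry(ZipInputStream zin) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        while ((n = zin.read(buffer, 0, buffer.length)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }
}
```

The sections below walk through the same steps with the Aspose.Zip API, which can leave regular entries untouched.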
Detecting entries that are archives
Let's decide which entries are archives themselves. We can figure this out from the extension of the entry name. Later we will remove those entries from the main archive, so keep them in a list.
if (entry.getName().toLowerCase(Locale.ROOT).endsWith(".zip")) {
    entriesToDelete.add(entry);
    ...
}
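An extension check trusts the entry name. As an optional extra safeguard, one could also verify that the extracted content starts with a zip signature before treating it as an archive. The sketch below is a hypothetical helper (ZipSignature is not part of Aspose.Zip) that checks the first four bytes against the local file header signature and the signature of an empty archive. Other zip-based formats start with the same bytes, so this check is meant to complement the extension test, not replace it.

```java
// Hypothetical helper, not part of Aspose.Zip.
final class ZipSignature {
    // "PK\3\4": signature of a local file header, found at the start of a
    // zip archive that has at least one entry.
    private static final byte[] LOCAL_HEADER = {0x50, 0x4B, 0x03, 0x04};
    // "PK\5\6": end of central directory record, found at the start of an
    // empty zip archive.
    private static final byte[] EMPTY_ARCHIVE = {0x50, 0x4B, 0x05, 0x06};

    // Returns true when `head` (the first bytes of some content) begins
    // with one of the two signatures above.
    static boolean looksLikeZip(byte[] head) {
        return startsWith(head, LOCAL_HEADER) || startsWith(head, EMPTY_ARCHIVE);
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data == null || data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

A caller could combine both tests, for example by requiring that the name ends with ".zip" and that ZipSignature.looksLikeZip returns true for the extracted bytes. Archives with data prepended before the first header, such as self-extracting archives, would not pass this check.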
Extracting entry’s content to memory
Aspose.Zip allows extracting the content of a zip entry into any writable stream, not only to a file. So, we can extract a nested archive to a memory stream.
Please note: the available memory (here, the JVM heap) must be large enough to hold all the extracted content.
byte[] b = new byte[8192];
int bytesRead;
InputStream entryStream = entry.open();
ByteArrayOutputStream innerCompressed = new ByteArrayOutputStream();
while (0 < (bytesRead = entryStream.read(b, 0, b.length))) {
    innerCompressed.write(b, 0, bytesRead);
}
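The copy loop above is needed twice in the complete algorithm, once for the nested archive and once for each of its entries. A small helper can hold it in one place and also close the source stream when done. This is a sketch only: StreamBuffering and readFully are hypothetical names, and it relies on nothing beyond java.io.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of Aspose.Zip.
final class StreamBuffering {
    // Reads `in` until end of stream, returns everything read,
    // and closes `in` even if reading fails.
    static byte[] readFully(InputStream in) throws IOException {
        try (InputStream source = in) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int bytesRead;
            // read() returns -1 once the end of the stream is reached
            while ((bytesRead = source.read(buffer, 0, buffer.length)) != -1) {
                out.write(buffer, 0, bytesRead);
            }
            return out.toByteArray();
        }
    }
}
```

With such a helper, the extraction step could be written as byte[] innerBytes = StreamBuffering.readFully(entry.open()); where entry.open() is the call used in the snippet above.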
After that, the innerCompressed stream contains the inner archive itself. The Archive constructor accepts a stream to decompress, so we can open the inner archive as well:
Archive inner = new Archive(new ByteArrayInputStream(innerCompressed.toByteArray()));
Excluding entries
We can remove an entry from a zip archive with the deleteEntry method.
for (ArchiveEntry e : entriesToDelete) {
    outer.deleteEntry(e);
}
Put it all together
Here is the complete algorithm.
try (Archive outer = new Archive("outer.zip")) {
    ArrayList<ArchiveEntry> entriesToDelete = new ArrayList<ArchiveEntry>();
    ArrayList<String> namesToInsert = new ArrayList<String>();
    ArrayList<InputStream> contentToInsert = new ArrayList<InputStream>();
    for (ArchiveEntry entry : outer.getEntries()) {
        // Find an entry which is an archive itself
        if (entry.getName().toLowerCase(Locale.ROOT).endsWith(".zip")) {
            // Keep a reference to the entry in order to remove it from the archive later
            entriesToDelete.add(entry);

            // Extract the entry to a memory stream
            byte[] b = new byte[8192];
            int bytesRead;
            InputStream entryStream = entry.open();
            ByteArrayOutputStream innerCompressed = new ByteArrayOutputStream();
            while (0 < (bytesRead = entryStream.read(b, 0, b.length))) {
                innerCompressed.write(b, 0, bytesRead);
            }

            // We know that the content of the entry is a zip archive, so we may extract it
            try (Archive inner = new Archive(new ByteArrayInputStream(innerCompressed.toByteArray()))) {

                // Loop over the entries of the inner archive
                for (ArchiveEntry ie : inner.getEntries()) {

                    // Keep the name of the inner entry
                    namesToInsert.add(ie.getName());

                    InputStream ieStream = ie.open();
                    ByteArrayOutputStream content = new ByteArrayOutputStream();
                    while (0 < (bytesRead = ieStream.read(b, 0, b.length))) {
                        content.write(b, 0, bytesRead);
                    }

                    // Keep the content of the inner entry
                    contentToInsert.add(new ByteArrayInputStream(content.toByteArray()));
                }
            }
        }
    }

    for (ArchiveEntry e : entriesToDelete) {
        // Delete all the entries which are archives themselves
        outer.deleteEntry(e);
    }

    for (int i = 0; i < namesToInsert.size(); i++) {
        // Add the entries which came from the inner archives
        outer.createEntry(namesToInsert.get(i), contentToInsert.get(i));
    }

    outer.save("flatten.zip");
} catch (Exception ex) {
    System.out.println(ex);
}
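The algorithm above assumes that no inner entry shares a name with an outer entry or with an entry of another nested archive. What createEntry does when a name already exists is not covered here, so one cautious option is to make names unique before inserting them. The sketch below is a hypothetical helper (EntryNames is not part of Aspose.Zip), and the "(2)", "(3)" suffix scheme is just one possible convention.

```java
import java.util.Set;

// Hypothetical helper, not part of Aspose.Zip.
final class EntryNames {
    // Returns `name` if it is not yet in `used`; otherwise inserts "(2)",
    // "(3)", ... before the file extension until the result is unused.
    // The returned name is recorded in `used`.
    static String unique(Set<String> used, String name) {
        int slash = name.lastIndexOf('/');
        int dot = name.lastIndexOf('.');
        // Only a dot inside the last path segment, and not at its very
        // start, marks an extension.
        boolean hasExtension = dot > slash + 1;
        String stem = hasExtension ? name.substring(0, dot) : name;
        String extension = hasExtension ? name.substring(dot) : "";

        String candidate = name;
        // Set.add returns false when the candidate is already taken.
        for (int n = 2; !used.add(candidate); n++) {
            candidate = stem + "(" + n + ")" + extension;
        }
        return candidate;
    }
}
```

To use it, one could fill a HashSet with the names of the outer entries that are kept, then call namesToInsert.add(EntryNames.unique(used, ie.getName())) in place of the plain namesToInsert.add call.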