Advantages of the “Store” compression level in 7zip
I have several thousand images in folders that I want to archive onto an external drive. When googling about compression I came across the ability to set the compression level to "store", which means the data is not compressed.
Is there an advantage to doing this rather than just leaving the files in their uncompressed form in Windows folders? Does it help with performance at all for an HDD?
compression
In the case of Android boot animations, it makes it so the system can read the data. I think it's just a way to "store" the data in one easily-transferable place.
– TheWanderer
Jun 29 '16 at 21:59
The biggest advantage of using "store", rather than compression, is that storing is faster than the "fastest" option.
– Martin
Dec 14 '18 at 6:45
asked Jun 29 '16 at 21:58 by Jack; edited Jun 29 '16 at 22:17 by ctrl-alt-delor
8 Answers
Is there an advantage to doing this rather than just leaving the files in their uncompressed form in Windows folders?
Yes.
As Keltari's answer notes, people may find it easier to work with one file than many. In practice, if a person has a bunch of files, they can often just place them into one folder and then perform file operations (e.g., copy) on the folder. The general concept of an archive file that contains files and the concept of a directory/folder that contains files are quite similar. In fact, these concepts are so similar that Microsoft's support for ZIP files, built into the graphical interface of WinXP (and newer) and some Win9x systems with certain code added, was named "Microsoft Compressed Folders" in Microsoft's graphical interface.
Example: When I use SquirrelMail, web-based mail software, I can upload a file. I can upload multiple files, but only one at a time; I cannot just select a bunch of files and upload the batch. If I have 30 files to upload, I can tell 7-Zip to combine the files using "store", so I don't waste a lot of time trying to compress the data much (if I know the data is incompressible), and then I can easily upload the one archive file within SquirrelMail.
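As an illustration of that kind of "bundle without compressing" step, here is a minimal sketch using Python's standard zipfile module with ZIP_STORED; the folder and file names are hypothetical, and 7-Zip's own "store" level achieves the same effect.

```python
import zipfile
from pathlib import Path

# Hypothetical folder of already-compressed files (e.g. JPEGs).
photos = sorted(Path("trip_photos").glob("*.jpg"))

# ZIP_STORED bundles the files into one archive without any compression pass.
with zipfile.ZipFile("trip_photos.zip", "w", compression=zipfile.ZIP_STORED) as zf:
    for photo in photos:
        zf.write(photo, arcname=photo.name)  # raw bytes are copied, not deflated
```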
Sometimes, some file extensions (e.g., .exe) might be forbidden, while archives might be permitted (by firewalls, anti-malware protection used by an E-Mail client, etc.)
However, there may be other advantages beyond just "ease of use" with some software. If the file archive format contains a file integrity hash for the compressed data, then file integrity can be checked when the data is accessed. This can detect errors that might go unnoticed if the file archive format wasn't used.
Of course, in theory, a filesystem could contain metadata that stores a file hash. The difference here is that filesystems don't typically contain that type of data, while archives do. So, even if filesystems could have that data, they typically don't (at least, not traditionally with many older filesystem types).
Another reason why the "store" method is commonly implemented by archive software is that it is very easy to program. So, there is little downside in making it an available option.
If data is backed up, then the archive will typically contain a timestamp which can be an easy way to note a time that the included files are older than. Directories/folders might not have the same sort of timestamps. Or, they might. With different filesystem types (e.g., NTFS vs. exFAT vs. Ext3 vs. Btrfs vs. ISO9660) and different operating systems implementing those filesystems, and sometimes filesystem types having multiple dates (creation/modification/access), people may be disinclined to trust that a directory's date actually reflects when the contents were updated (instead of some other meaning, like when the directory was created, renamed, or had permissions altered, but not necessarily data modification). An archive file's timestamp, especially if that time is part of the filename, is commonly trustworthy.
Does it help with performance at all for an HDD?
Generally not. Such stored files typically have overhead (from some data called a "header"), so the archived data is often going to be slightly slower to access, not faster. However, exceptions exist: it could be faster.
Sometimes code takes a long time just to locate a file (possibly because it is sorting through a large number of files), and after performing a file operation (copy/delete/whatever), locating the next file takes just as long. Such problems can often be avoided by using software, including filesystem drivers, optimized to handle these situations, but in other cases they do occur. Copying one large file usually does not carry the same per-file cost. (Then again, at least historically, dealing with one very large file could sometimes have a significant cost of its own, which could be even greater.)
The biggest advantage of using store, rather than compression, is that storing is faster, because compression requires extra time to perform its calculations.
A lot of this perception is based on older technology. In reality, compression can save time if the CPU is fast enough (so compressing the data doesn't take much time) and if the data compresses well enough that noticeably less data needs to be written to / read from the disk. Fast CPU compression of large data, followed by a slow write of the (smaller) compressed data, may finish sooner than a slow write of the uncompressed data.
There can be other factors too, like less usage ("wear and tear") of more fragile equipment (like hard drives).
Whether compressing (and storing compressed data) or storing (uncompressed data) is faster depends on: the speed of compressing, the effectiveness of compressing (just how much smaller does the data become after the compression is performed), and the speed of writing/reading the larger amount of data. The results tend to vary over time, based on differences in CPU speed, algorithm effectiveness (different algorithms, and possibly different options being used for those algorithms), and storage speed.
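To make that trade-off concrete, here is a rough, illustrative sketch (my own, not from the original answer) that only measures the CPU-side cost of different zlib levels on one synthetic buffer; level 0 is effectively "store". Real results depend on your CPU, your data, and how fast the disk can write the resulting bytes.

```python
import time
import zlib

data = b"some fairly repetitive example text " * 200_000  # a few MB of compressible data

for level in (0, 1, 6, 9):  # 0 = store (no compression), 1 = fastest, 9 = best
    start = time.perf_counter()
    out = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    print(f"level {level}: {len(data)} -> {len(out)} bytes in {elapsed * 1000:.1f} ms")
```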
In general, decompression has often been much faster than compression (because it simply re-creates data based on known results, and doesn't involve as much exploration/guessing), so if you have to write data once and then read it many times, compression is very often worthwhile. For other cases, many people don't find benefit in using compression.
Because CPUs are now so much faster than they used to be, store does seem to be getting used less (people often tolerate the cost of at least the minimal/fast forms of compression). However, archive programs (like 7-Zip) keep supporting "store": so that people can still access (extract/modify) archives that use the store technique, because it can still help people on old systems, because it is useful for other tasks (bundling data quickly without wasting time trying to compress data that is unlikely to compress well), and because storing is so simple to implement that there is little incentive to remove it. So the option tends to remain available.
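For completeness, a hedged sketch of creating such a store-level archive with the 7-Zip command-line tool from Python; it assumes a 7z executable is on the PATH, and the archive and folder names are hypothetical. -mx=0 is 7-Zip's "copy"/store level.

```python
import subprocess

# Bundle a folder into a .7z archive without compressing it (-mx=0 = store/copy).
# "photos_2016.7z" and "trip_photos" are placeholder names for this example.
subprocess.run(
    ["7z", "a", "-mx=0", "photos_2016.7z", "trip_photos"],
    check=True,  # raise if 7z reports an error
)
```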
Assuming that you may be occasionally accessing individual files from the external drive (say they're travel photos), there is no reason to compress them into a single archive.
These don't really apply to your case, but in general there are a few advantages to using a 'store' compression method to group multiple files into a single archive for archival or network transfer:
Easier to manage a single file if sending attachments via email or copying to USB for distribution. E.g. you could archive travel photos based on the trip, then it's trivial to copy/mail the right archive to others on the same trip without forgetting to include some pics (or mixing in others).
Avoid file transfer overhead: Negotiation protocols when doing a network file transfer can add significant overhead to transferring each file.
Less space wastage on block devices: this was a significant issue long ago when the FAT file system had 32 kB block sizes (so even a 500-byte icon would take 32 kB on disk). Nowadays the block size should be 4 kB or less, and the wastage is usually a trivial non-factor.
Storing non-compressible data in an archive won't help with HDD performance, except for mostly insignificant things like the OS having to check individual file permissions versus a single permission for the entire archive, etc.
I will assume that you are asking about the zip archiver.
Setting the compression level to store lets you put all the files into one archive (file) without compressing them.
- The advantage over leaving them in a directory hierarchy is that they are now one file, so they can be easier to manage, e.g. if sending via email.
- The advantages over compressing as well are:
- If you store data that is already compressed (such as most image formats, e.g. jpeg, png), the file may grow if you try to compress it, and the attempt costs a lot of processing (see the sketch after this list).
- If you store the archive in another archive / repository, it may result in better compression, because it all gets compressed by the outer archive/repository.
- If you store it in a revision control system, the system can see the changes between revisions, which results in an overall smaller repository.
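A small sketch of the first point above, using zlib as a stand-in for a zip archiver: incompressible data (random bytes standing in for JPEG/PNG payloads) comes out slightly larger after a maximum-effort compression pass, while repetitive text shrinks dramatically.

```python
import os
import zlib

incompressible = os.urandom(1_000_000)            # stand-in for already-compressed image data
repetitive = b"the same phrase over and over " * 40_000

for name, payload in (("random bytes", incompressible), ("repetitive text", repetitive)):
    out = zlib.compress(payload, 9)               # maximum compression effort
    print(f"{name}: {len(payload)} -> {len(out)} bytes")
```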
I agree with the rest, but why would it make more sense to use a 'store'd archive in a RCS instead of individual files?
– Alok
Jun 30 '16 at 1:33
@alok Some file formats use zip, e.g. OpenDocument (odt, odp, ods) and Microsoft's Office formats. These are not the best formats to use in a revision control system, but uncompressing them helps. Mercurial (hg) has an extension, doczip, that can do this automatically (you can add new file extensions if you wish, but the common ones are pre-configured).
– ctrl-alt-delor
Jun 30 '16 at 8:35
Using an archive does offer some advantages. It makes user file management easier. Do you want to move/copy/backup those files? It is far easier to move one file, than several thousand. Simply put, less is easier to manage than more for a person.
Also, when it comes to compressing thousands of image files, you might get little compression if the files are .JPG, or any other type of already compressed file. You would spend a long time compressing them into a single archive, with little space savings.
It doesn't offer much in the way of performance. Yes, it is faster to index one file than several thousand. However, indexing doesn't happen often, and a few thousand files isn't many.
Noncompressed archives are less likely to be completely wrecked if there's any data corruption. As I wrote in an existing answer, 7zip is able to extract all files from the archive even if the checksums for some don't match. The data stored in the space affected by the corruption will still be destroyed, of course, but the rest of the file containing the damaged run is still recoverable.
If you used the old method of LZW compression, for instance, all of a file's data after a damaged section would be impossible to recover. Even if just one byte was zeroed, the dictionary of the decompressor would no longer match the dictionary of the compressor, and everything that came out after the error would be trash. (More likely, the decompressor would crash.) Other compression algorithms may be somewhat less sensitive to corruption, but it's trivial to salvage a non-compressed archive, even manually.
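As a rough illustration of how a single damaged byte ruins a compressed stream (using zlib/DEFLATE here rather than LZW, but the failure mode is similar), this sketch flips one byte in the middle of compressed data and then tries to decompress it:

```python
import zlib

original = b"an example payload that compresses reasonably well " * 1_000
packed = bytearray(zlib.compress(original, 6))

packed[len(packed) // 2] ^= 0xFF   # simulate one corrupted byte mid-stream

try:
    zlib.decompress(bytes(packed))
except zlib.error as exc:
    # Decompression typically fails outright; everything after the damage is lost.
    print("decompression failed:", exc)
```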
Only adding to the other answers:
If multiple files will fit in the same "clusters" (the block quantity that the file system writes in), grouping them uses less disk space.
Each file is stored in separate clusters on FAT and NTFS systems; if a file only needs 1.2 clusters, it will still use 2 clusters. If the grouped file needs 120.2 clusters, it will take 121 clusters to store.
If items are grouped together as a single archive, database, zip, or disk image, and stored as a single file, that single file wastes less cluster space than the same data stored as many separate files.
Every file has some small amount of wasted space at the end of its last cluster; one huge file has only one such small amount of wasted space.
To see this cluster usage simply (in Windows), open Properties on a set of files or folders and compare "Size" with "Size on disk": the size on disk is the total cluster space required to store the files, including the wasted space. The smaller the cluster size, the less waste there is.
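As a back-of-the-envelope sketch of the "size on disk" effect (the 4096-byte cluster size and the file sizes below are just assumptions):

```python
import math

CLUSTER = 4096                                    # bytes per allocation unit
file_sizes = [500, 4_097, 123_456, 7_340_032]     # hypothetical file sizes in bytes

def size_on_disk(size: int, cluster: int = CLUSTER) -> int:
    return math.ceil(size / cluster) * cluster    # files occupy whole clusters

loose = sum(size_on_disk(s) for s in file_sizes)
bundled = size_on_disk(sum(file_sizes))           # one stored archive (its header overhead ignored)

print("data size   :", sum(file_sizes), "bytes")
print("loose files :", loose, "bytes on disk")
print("one archive :", bundled, "bytes on disk")
```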
On the other hand, smaller cluster sizes tend to be slower for large data. It depends on how you set the cluster size, or whether you sized it specifically for the size/type of data stored in that partition.
In most scenarios with today's common stored data, the data is already using some form of compression. It is less wise to compress it again and create "dependency" archive items which, compressed or not, are harder to recover parts and pieces of.
Example: trying to fix a corrupt database with a 2% error, versus recovering 98% of your files as separate items (database recovery can be messy, or may need software designed for that exact purpose).
Unless the data is to be archived specifically (as a backup method), transferred simply across the internet, packaged for distribution, or can be highly compressed, it is usually better to keep files separate: not packaged, grouped, further compressed, or even encrypted if that is not necessary. Fewer complications (dependencies), less software and work needed to package and unpackage, and a better/easier chance of recovering parts of it on failure.
Take the example of running compare routines over 1000 smaller files or 1 huge archive. Say your compare routine reports 3 bits of incorrect data in either one. In one case you have 3 bad files out of 1000; in the grouped case you have 1000 files in a group that something is wrong with :-)
Adding complications to data completely unnecessarily does not help the user understand what is there, back it up to other sources, ensure it never gets corrupted, or recover whatever is recoverable if something fails.
One contiguous, sequential block of data (it has to be completely defragmented first) is faster for the hard disk to access. Any additional routines the computer has to go through vary in how much work they require, and in how much faster the result can actually be.
Compressible files can potentially be (much) faster to read out, even with the decompression work required, because less data has to come off the disk.
Files that won't shrink much further (already compressed) just mean more work and more complication. Whether that is advantageous to you is your call, given that it can also mean more work and less direct access/visibility for the user.
Databases, archives, disk images, and other large blocks of combined small data, compressed or not, can be accessed faster, sequentially and using CPU processes designed to work with many small items in a more efficient or speedier way. Where would we be without databases? Some things would be terribly slow, disorganised messes.
Conclusion: IMO, unless there is a serious need to compress, encrypt, group, or package, unless there is a distribution need, or a way to archive and back up another copy of it, already-compressed data should have fewer complications, not more. The space savings can be addressed with a proper cluster size; the speed can be addressed with proper defragging. Any time there is corruption, a need to recover data, or even just to understand it, it is better that it is simple.
For business, the web, database access, packaging and distribution, the methods used are great: speedy, helpful and manageable.
For normal users storing their already-compressed photos and videos, it is the backups and multiple copies of that data that will always be more important than fiddling with packing it for speed or even for disk space savings.
So back it up first, before worrying about it being a bit faster.
There is actually another advantage to archives over "regular" folders. If you happen to crash your disk, or for any other reason need a low-level file recovery tool (e.g. TestDisk+PhotoRec), you'd be happy to recover "coherent" archives instead of messed-up files oblivious of any folder structure.
I recommend compression if you would also like to securely 'lock' a folder with a password (recommended 15-20 characters long if using 256-bit encryption). I do this for my porn folders, which are quite large and also benefit from being compressed, as it frees up room on the drive. It also allows for much faster transfer of the folder between drives.
8 Answers
8
active
oldest
votes
8 Answers
8
active
oldest
votes
active
oldest
votes
active
oldest
votes
Is there an advantage of doing this rather than just leaving the files in their uncompressed form in windows folders?
Yes.
As Keltari's answer notes, people may find it easier to work with one file than many. In practice, actually, if a person has a bunch of files, they can often just place the files into one folder, and then perform file operations (e.g., copy) on the folder. The general concept of an archive file that contains files, and the concept of a directory/folder that contains files, are quite similar. In fact, these concepts are so similar that Microsoft's support for ZIP files, built into the graphical interface of WinXP (and newer) and some Win9x systems with certain code added, was named "Microsoft Compressed Folders" in Microsoft's graphical interface.
Example: When I use SquirrelMail, web-based mail software, I can upload a file. I can upload multiple files, one at a time. I cannot just select a bunch of files and upload the batch. If I have 30 files to upload, I can just tell 7-Zip to compress the files using "store", so I don't waste a ton of time trying to compress the data much (if I know the data is uncompressable), and then I can just upload the one (compressed) file within SquirrelMail, easily.
Sometimes, some file extensions (e.g., .exe) might be forbidden, while archives might be permitted (by firewalls, anti-malware protection used by an E-Mail client, etc.)
However, there may be other advantages besides just apparently "ease of use" with some software.. If the file archive format contains a file integrity hash for compressed data, then file integrity can be checked when data is accessed. This can result in detecting errors that might not be detected if the file archive format wasn't used.
Of course, in theory, a filesystem could contain metadata that stores a file hash. The difference here is that filesystems don't typically contain that type of data, while archives do. So, even if filesystems could have that data, they typically don't (at least, not traditionally with many older filesystem types).
Another reason why the "store" method is commonly implemented by archive software is that it is very easy to program. So, there is little downside in making it an available option.
If data is backed up, then the archive will typically contain a timestamp which can be an easy way to note a time that the included files are older than. Directories/folders might not have the same sort of timestamps. Or, they might. With different filesystem types (e.g., NTFS vs. exFAT vs. Ext3 vs. Btrfs vs. ISO9660) and different operating systems implementing those filesystems, and sometimes filesystem types having multiple dates (creation/modification/access), people may be disinclined to trust that a directory's date actually reflects when the contents were updated (instead of some other meaning, like when the directory was created, renamed, or had permissions altered, but not necessarily data modification). An archive file's timestamp, especially if that time is part of the filename, is commonly trustworthy.
Does it help with performance at all for a HDD?
Hopefully not. After all, such stored files typically have overhead (from some data called a "header"), so the archived data is often going to be slightly slower, not faster. However, exceptions could exist: it could be faster.
Sometimes, some code would locate a file, which would take a long time (possibly because it's basically sorting through a large number of files). After performing a file operation (copy/delete/whatever), then locating the next file would take a long time. Such problems can often be avoided by using software, including filesystem drivers, which are optimized to handle such situations. However, in other cases, such situations have been known to occur. Copying one large file would often not have the exact same cost. (Then again, at least historically, sometimes dealing with a large file might have a significant cost, which could be an even greater cost.)
The biggest advantage of using store, rather than compression, is that storing is faster. This is because time is required to be taken in order to perform the calculations needed to do the data compression.
A lot of this perception was based on older technology. In reality, compression could save time, if the CPU is sufficiently fast (so that compressing data doesn't take much time) and if the data is compressed enough that less data needs to be written to / read from a disk. Fast CPU compression of larger data, plus slow writing of compressed data, may be faster than slow writing of uncompressed data.
There can be other factors too, like less usage ("wear and tear") of more fragile equipment (like hard drives).
Whether compressing (and storing compressed data) or storing (uncompressed data) is faster depends on: the speed of compressing, the effectiveness of compressing (just how much smaller does the data become after the compression is performed), and the speed of writing/reading the larger amount of data. The results tend to vary over time, based on differences in CPU speed, algorithm effectiveness (different algorithms, and possibly different options being used for those algorithms), and storage speed.
In general, decompression has often been much faster than compression (because it simply re-creates data based on known results, and doesn't involve as much exploration/guessing), so if you have to write data once and then read it many times, compression is very often worthwhile. For other cases, many people don't find benefit in using compression.
Because CPU power is sufficiently faster than historically times, store does seem to be getting used less. (People often tolerate the cost for at least the minimal/fast forms of compression.) However, archive programs (like 7-Zip) often want to keep supporting "store" so that people can still access (extract/modify) archives that use the store technique, and because it could be helpful for some people (on old systems), and because it can be useful for other tasks (creating a combination of data quickly, without wasting time trying to compress data that is unlikely to compress well), and because storing is a simple process so there is little incentive to remove it, the option tends to remain available.
add a comment |
Is there an advantage of doing this rather than just leaving the files in their uncompressed form in windows folders?
Yes.
As Keltari's answer notes, people may find it easier to work with one file than many. In practice, actually, if a person has a bunch of files, they can often just place the files into one folder, and then perform file operations (e.g., copy) on the folder. The general concept of an archive file that contains files, and the concept of a directory/folder that contains files, are quite similar. In fact, these concepts are so similar that Microsoft's support for ZIP files, built into the graphical interface of WinXP (and newer) and some Win9x systems with certain code added, was named "Microsoft Compressed Folders" in Microsoft's graphical interface.
Example: When I use SquirrelMail, web-based mail software, I can upload a file. I can upload multiple files, one at a time. I cannot just select a bunch of files and upload the batch. If I have 30 files to upload, I can just tell 7-Zip to compress the files using "store", so I don't waste a ton of time trying to compress the data much (if I know the data is uncompressable), and then I can just upload the one (compressed) file within SquirrelMail, easily.
Sometimes, some file extensions (e.g., .exe) might be forbidden, while archives might be permitted (by firewalls, anti-malware protection used by an E-Mail client, etc.)
However, there may be other advantages besides just apparently "ease of use" with some software.. If the file archive format contains a file integrity hash for compressed data, then file integrity can be checked when data is accessed. This can result in detecting errors that might not be detected if the file archive format wasn't used.
Of course, in theory, a filesystem could contain metadata that stores a file hash. The difference here is that filesystems don't typically contain that type of data, while archives do. So, even if filesystems could have that data, they typically don't (at least, not traditionally with many older filesystem types).
Another reason why the "store" method is commonly implemented by archive software is that it is very easy to program. So, there is little downside in making it an available option.
If data is backed up, then the archive will typically contain a timestamp which can be an easy way to note a time that the included files are older than. Directories/folders might not have the same sort of timestamps. Or, they might. With different filesystem types (e.g., NTFS vs. exFAT vs. Ext3 vs. Btrfs vs. ISO9660) and different operating systems implementing those filesystems, and sometimes filesystem types having multiple dates (creation/modification/access), people may be disinclined to trust that a directory's date actually reflects when the contents were updated (instead of some other meaning, like when the directory was created, renamed, or had permissions altered, but not necessarily data modification). An archive file's timestamp, especially if that time is part of the filename, is commonly trustworthy.
Does it help with performance at all for a HDD?
Hopefully not. After all, such stored files typically have overhead (from some data called a "header"), so the archived data is often going to be slightly slower, not faster. However, exceptions could exist: it could be faster.
Sometimes, some code would locate a file, which would take a long time (possibly because it's basically sorting through a large number of files). After performing a file operation (copy/delete/whatever), then locating the next file would take a long time. Such problems can often be avoided by using software, including filesystem drivers, which are optimized to handle such situations. However, in other cases, such situations have been known to occur. Copying one large file would often not have the exact same cost. (Then again, at least historically, sometimes dealing with a large file might have a significant cost, which could be an even greater cost.)
The biggest advantage of using store, rather than compression, is that storing is faster. This is because time is required to be taken in order to perform the calculations needed to do the data compression.
A lot of this perception was based on older technology. In reality, compression could save time, if the CPU is sufficiently fast (so that compressing data doesn't take much time) and if the data is compressed enough that less data needs to be written to / read from a disk. Fast CPU compression of larger data, plus slow writing of compressed data, may be faster than slow writing of uncompressed data.
There can be other factors too, like less usage ("wear and tear") of more fragile equipment (like hard drives).
Whether compressing (and storing compressed data) or storing (uncompressed data) is faster depends on: the speed of compressing, the effectiveness of compressing (just how much smaller does the data become after the compression is performed), and the speed of writing/reading the larger amount of data. The results tend to vary over time, based on differences in CPU speed, algorithm effectiveness (different algorithms, and possibly different options being used for those algorithms), and storage speed.
In general, decompression has often been much faster than compression (because it simply re-creates data based on known results, and doesn't involve as much exploration/guessing), so if you have to write data once and then read it many times, compression is very often worthwhile. For other cases, many people don't find benefit in using compression.
Because CPU power is sufficiently faster than historically times, store does seem to be getting used less. (People often tolerate the cost for at least the minimal/fast forms of compression.) However, archive programs (like 7-Zip) often want to keep supporting "store" so that people can still access (extract/modify) archives that use the store technique, and because it could be helpful for some people (on old systems), and because it can be useful for other tasks (creating a combination of data quickly, without wasting time trying to compress data that is unlikely to compress well), and because storing is a simple process so there is little incentive to remove it, the option tends to remain available.
add a comment |
Is there an advantage of doing this rather than just leaving the files in their uncompressed form in windows folders?
Yes.
As Keltari's answer notes, people may find it easier to work with one file than many. In practice, actually, if a person has a bunch of files, they can often just place the files into one folder, and then perform file operations (e.g., copy) on the folder. The general concept of an archive file that contains files, and the concept of a directory/folder that contains files, are quite similar. In fact, these concepts are so similar that Microsoft's support for ZIP files, built into the graphical interface of WinXP (and newer) and some Win9x systems with certain code added, was named "Microsoft Compressed Folders" in Microsoft's graphical interface.
Example: When I use SquirrelMail, web-based mail software, I can upload a file. I can upload multiple files, one at a time. I cannot just select a bunch of files and upload the batch. If I have 30 files to upload, I can just tell 7-Zip to compress the files using "store", so I don't waste a ton of time trying to compress the data much (if I know the data is uncompressable), and then I can just upload the one (compressed) file within SquirrelMail, easily.
Sometimes, some file extensions (e.g., .exe) might be forbidden, while archives might be permitted (by firewalls, anti-malware protection used by an E-Mail client, etc.)
However, there may be other advantages besides just apparently "ease of use" with some software.. If the file archive format contains a file integrity hash for compressed data, then file integrity can be checked when data is accessed. This can result in detecting errors that might not be detected if the file archive format wasn't used.
Of course, in theory, a filesystem could contain metadata that stores a file hash. The difference here is that filesystems don't typically contain that type of data, while archives do. So, even if filesystems could have that data, they typically don't (at least, not traditionally with many older filesystem types).
Another reason why the "store" method is commonly implemented by archive software is that it is very easy to program. So, there is little downside in making it an available option.
If data is backed up, then the archive will typically contain a timestamp which can be an easy way to note a time that the included files are older than. Directories/folders might not have the same sort of timestamps. Or, they might. With different filesystem types (e.g., NTFS vs. exFAT vs. Ext3 vs. Btrfs vs. ISO9660) and different operating systems implementing those filesystems, and sometimes filesystem types having multiple dates (creation/modification/access), people may be disinclined to trust that a directory's date actually reflects when the contents were updated (instead of some other meaning, like when the directory was created, renamed, or had permissions altered, but not necessarily data modification). An archive file's timestamp, especially if that time is part of the filename, is commonly trustworthy.
Does it help with performance at all for a HDD?
Hopefully not. After all, such stored files typically have overhead (from some data called a "header"), so the archived data is often going to be slightly slower, not faster. However, exceptions could exist: it could be faster.
Sometimes, some code would locate a file, which would take a long time (possibly because it's basically sorting through a large number of files). After performing a file operation (copy/delete/whatever), then locating the next file would take a long time. Such problems can often be avoided by using software, including filesystem drivers, which are optimized to handle such situations. However, in other cases, such situations have been known to occur. Copying one large file would often not have the exact same cost. (Then again, at least historically, sometimes dealing with a large file might have a significant cost, which could be an even greater cost.)
The biggest advantage of using store, rather than compression, is that storing is faster. This is because time is required to be taken in order to perform the calculations needed to do the data compression.
A lot of this perception was based on older technology. In reality, compression could save time, if the CPU is sufficiently fast (so that compressing data doesn't take much time) and if the data is compressed enough that less data needs to be written to / read from a disk. Fast CPU compression of larger data, plus slow writing of compressed data, may be faster than slow writing of uncompressed data.
There can be other factors too, like less usage ("wear and tear") of more fragile equipment (like hard drives).
Whether compressing (and storing compressed data) or storing (uncompressed data) is faster depends on: the speed of compressing, the effectiveness of compressing (just how much smaller does the data become after the compression is performed), and the speed of writing/reading the larger amount of data. The results tend to vary over time, based on differences in CPU speed, algorithm effectiveness (different algorithms, and possibly different options being used for those algorithms), and storage speed.
In general, decompression has often been much faster than compression (because it simply re-creates data based on known results, and doesn't involve as much exploration/guessing), so if you have to write data once and then read it many times, compression is very often worthwhile. For other cases, many people don't find benefit in using compression.
Because CPU power is sufficiently faster than historically times, store does seem to be getting used less. (People often tolerate the cost for at least the minimal/fast forms of compression.) However, archive programs (like 7-Zip) often want to keep supporting "store" so that people can still access (extract/modify) archives that use the store technique, and because it could be helpful for some people (on old systems), and because it can be useful for other tasks (creating a combination of data quickly, without wasting time trying to compress data that is unlikely to compress well), and because storing is a simple process so there is little incentive to remove it, the option tends to remain available.
Is there an advantage of doing this rather than just leaving the files in their uncompressed form in windows folders?
Yes.
As Keltari's answer notes, people may find it easier to work with one file than many. In practice, actually, if a person has a bunch of files, they can often just place the files into one folder, and then perform file operations (e.g., copy) on the folder. The general concept of an archive file that contains files, and the concept of a directory/folder that contains files, are quite similar. In fact, these concepts are so similar that Microsoft's support for ZIP files, built into the graphical interface of WinXP (and newer) and some Win9x systems with certain code added, was named "Microsoft Compressed Folders" in Microsoft's graphical interface.
Example: When I use SquirrelMail, web-based mail software, I can upload a file. I can upload multiple files, one at a time. I cannot just select a bunch of files and upload the batch. If I have 30 files to upload, I can just tell 7-Zip to compress the files using "store", so I don't waste a ton of time trying to compress the data much (if I know the data is uncompressable), and then I can just upload the one (compressed) file within SquirrelMail, easily.
Sometimes, some file extensions (e.g., .exe) might be forbidden, while archives might be permitted (by firewalls, anti-malware protection used by an E-Mail client, etc.)
However, there may be other advantages besides just apparently "ease of use" with some software.. If the file archive format contains a file integrity hash for compressed data, then file integrity can be checked when data is accessed. This can result in detecting errors that might not be detected if the file archive format wasn't used.
Of course, in theory, a filesystem could contain metadata that stores a file hash. The difference here is that filesystems don't typically contain that type of data, while archives do. So, even if filesystems could have that data, they typically don't (at least, not traditionally with many older filesystem types).
Another reason why the "store" method is commonly implemented by archive software is that it is very easy to program. So, there is little downside in making it an available option.
If data is backed up, then the archive will typically contain a timestamp which can be an easy way to note a time that the included files are older than. Directories/folders might not have the same sort of timestamps. Or, they might. With different filesystem types (e.g., NTFS vs. exFAT vs. Ext3 vs. Btrfs vs. ISO9660) and different operating systems implementing those filesystems, and sometimes filesystem types having multiple dates (creation/modification/access), people may be disinclined to trust that a directory's date actually reflects when the contents were updated (instead of some other meaning, like when the directory was created, renamed, or had permissions altered, but not necessarily data modification). An archive file's timestamp, especially if that time is part of the filename, is commonly trustworthy.
Does it help with performance at all for a HDD?
Hopefully not. After all, such stored files typically have overhead (from some data called a "header"), so the archived data is often going to be slightly slower, not faster. However, exceptions could exist: it could be faster.
Sometimes, some code would locate a file, which would take a long time (possibly because it's basically sorting through a large number of files). After performing a file operation (copy/delete/whatever), then locating the next file would take a long time. Such problems can often be avoided by using software, including filesystem drivers, which are optimized to handle such situations. However, in other cases, such situations have been known to occur. Copying one large file would often not have the exact same cost. (Then again, at least historically, sometimes dealing with a large file might have a significant cost, which could be an even greater cost.)
The biggest advantage of using store, rather than compression, is that storing is faster. This is because time is required to be taken in order to perform the calculations needed to do the data compression.
A lot of this perception was based on older technology. In reality, compression could save time, if the CPU is sufficiently fast (so that compressing data doesn't take much time) and if the data is compressed enough that less data needs to be written to / read from a disk. Fast CPU compression of larger data, plus slow writing of compressed data, may be faster than slow writing of uncompressed data.
There can be other factors too, like less usage ("wear and tear") of more fragile equipment (like hard drives).
Whether compressing (and storing compressed data) or storing (uncompressed data) is faster depends on: the speed of compressing, the effectiveness of compressing (just how much smaller does the data become after the compression is performed), and the speed of writing/reading the larger amount of data. The results tend to vary over time, based on differences in CPU speed, algorithm effectiveness (different algorithms, and possibly different options being used for those algorithms), and storage speed.
In general, decompression has often been much faster than compression (because it simply re-creates data based on known results, and doesn't involve as much exploration/guessing), so if you have to write data once and then read it many times, compression is very often worthwhile. For other cases, many people don't find benefit in using compression.
Because CPU power is sufficiently faster than historically times, store does seem to be getting used less. (People often tolerate the cost for at least the minimal/fast forms of compression.) However, archive programs (like 7-Zip) often want to keep supporting "store" so that people can still access (extract/modify) archives that use the store technique, and because it could be helpful for some people (on old systems), and because it can be useful for other tasks (creating a combination of data quickly, without wasting time trying to compress data that is unlikely to compress well), and because storing is a simple process so there is little incentive to remove it, the option tends to remain available.
answered Jun 30 '16 at 7:42
TOOGAMTOOGAM
11.5k32646
11.5k32646
add a comment |
add a comment |
Assuming that you may be occasionally accessing individual files from the external drive (say they're travel photos), there is no reason to compress them into a single archive.
These don't really apply to your case, but in general there are a few advantages to using a 'store' compression method to group multiple files into a single archive for archival or network transfer:
Easier to manage a single file if sending attachments via email or copying to USB for distribution. e.g. you could archive travel photos based on the trip, then its trivial to copy/mail the right archive to others on the same trip without forgetting to include some pics (or mixing in others).
Avoid file transfer overhead: Negotiation protocols when doing a network file transfer can add significant overhead to transferring each file.
Less space wastage on block devices: This was a significant issue long ago when FAT file system had 32kB block sizes (so, even a 500b icon will take 32kB on disk). Nowadays the block size should be 4kB or less, and the wastage is usually a trivial non-factor.
Storing non-compressible data into an archive won't help with HDD performance, except for mostly insignificant stuff like OS having to check individual file permissions vs a single permission for entire archive taking a bit longer etc.
add a comment |
Assuming that you may be occasionally accessing individual files from the external drive (say they're travel photos), there is no reason to compress them into a single archive.
These don't really apply to your case, but in general there are a few advantages to using a 'store' compression method to group multiple files into a single archive for archival or network transfer:
Easier to manage a single file if sending attachments via email or copying to USB for distribution. e.g. you could archive travel photos based on the trip, then its trivial to copy/mail the right archive to others on the same trip without forgetting to include some pics (or mixing in others).
Avoid file transfer overhead: Negotiation protocols when doing a network file transfer can add significant overhead to transferring each file.
Less space wastage on block devices: This was a significant issue long ago when FAT file system had 32kB block sizes (so, even a 500b icon will take 32kB on disk). Nowadays the block size should be 4kB or less, and the wastage is usually a trivial non-factor.
Storing non-compressible data into an archive won't help with HDD performance, except for mostly insignificant stuff like OS having to check individual file permissions vs a single permission for entire archive taking a bit longer etc.
add a comment |
Assuming that you may be occasionally accessing individual files from the external drive (say they're travel photos), there is no reason to compress them into a single archive.
These don't really apply to your case, but in general there are a few advantages to using a 'store' compression method to group multiple files into a single archive for archival or network transfer:
Easier to manage a single file if sending attachments via email or copying to USB for distribution. e.g. you could archive travel photos based on the trip, then its trivial to copy/mail the right archive to others on the same trip without forgetting to include some pics (or mixing in others).
Avoid file transfer overhead: Negotiation protocols when doing a network file transfer can add significant overhead to transferring each file.
Less space wastage on block devices: This was a significant issue long ago when FAT file system had 32kB block sizes (so, even a 500b icon will take 32kB on disk). Nowadays the block size should be 4kB or less, and the wastage is usually a trivial non-factor.
Storing non-compressible data into an archive won't help with HDD performance, except for mostly insignificant stuff like OS having to check individual file permissions vs a single permission for entire archive taking a bit longer etc.
Assuming that you may be occasionally accessing individual files from the external drive (say they're travel photos), there is no reason to compress them into a single archive.
These don't really apply to your case, but in general there are a few advantages to using a 'store' compression method to group multiple files into a single archive for archival or network transfer:
Easier to manage a single file if sending attachments via email or copying to USB for distribution. e.g. you could archive travel photos based on the trip, then its trivial to copy/mail the right archive to others on the same trip without forgetting to include some pics (or mixing in others).
Avoid file transfer overhead: Negotiation protocols when doing a network file transfer can add significant overhead to transferring each file.
Less space wastage on block devices: This was a significant issue long ago when FAT file system had 32kB block sizes (so, even a 500b icon will take 32kB on disk). Nowadays the block size should be 4kB or less, and the wastage is usually a trivial non-factor.
Storing non-compressible data into an archive won't help with HDD performance, except for mostly insignificant stuff like OS having to check individual file permissions vs a single permission for entire archive taking a bit longer etc.
answered Jun 30 '16 at 2:01
AlokAlok
446210
446210
add a comment |
add a comment |
I will assume that you are asking about the zip archiver.
Setting compression level to store, allows you to put all the files in to one archive (file), but not compress it.
- The advantage over leaving in a directory hierarchy is that it is now one file, so could be easier to manage e.g. if sending via an email.
- The advantages over compressing as well are:
- If you store data that is already compressed (such as most image formats e.g. jpeg, png), the file may grow if you try to compress, and is a lot of processing.
- If you store the archive in another archive / repository, it may result in better compression, if it is all compressed by the outer archive/repository.
- If you store it in a revision control system, then being able to see changes between revisions, will result in an overall smaller repository.
1
I agree with the rest, but why would it make more sense to use a 'store'd archive in a RCS instead of individual files?
– Alok
Jun 30 '16 at 1:33
@alok Some file-formats use zip, e.g. open document (odt
,odp
,ods
), and mircosoft's office. These are not the best formats to use in a revision control system, but uncompressing them helps. Mercurial (hg
), has an extensiondoczip
that can do this automatically (you can add new file extensions if you wish, but the common ones are pre configured).
– ctrl-alt-delor
Jun 30 '16 at 8:35
Using an archive does offer some advantages. It makes user file management easier. Do you want to move, copy, or back up those files? It is far easier to move one file than several thousand. Simply put, less is easier to manage than more.
Also, when it comes to compressing thousands of image files, you might get little compression if the files are JPGs or any other already-compressed type. You would spend a long time compressing them into a single archive with little space savings.
It doesn't offer much in the way of performance. Yes, it is faster to index one file than several thousand, but indexing doesn't happen often, and a few thousand files isn't many.
answered Jun 29 '16 at 22:35 – Keltari
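If you want to check whether compression would actually buy anything before spending the time, one rough approach (an illustrative sketch, not something from this answer; the file names are placeholders) is to run a few sample files through zlib and compare sizes:

```python
import zlib

def compression_ratio(path):
    """Return compressed/original size for one file; ~1.0 means incompressible."""
    with open(path, "rb") as f:
        data = f.read()
    if not data:
        return 1.0
    return len(zlib.compress(data, 6)) / len(data)

# Already-compressed formats such as JPEG typically come back near (or above) 1.0.
for name in ["IMG_0001.jpg", "notes.txt"]:   # placeholder sample files
    print(name, round(compression_ratio(name), 3))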
Noncompressed archives are less likely to be completely wrecked if there's any data corruption. As I wrote in an existing answer, 7zip is able to extract all files from the archive even if the checksums for some don't match. The data stored in the space affected by the corruption will still be destroyed, of course, but the rest of the file containing the damaged run is still recoverable.
If you used old-style LZW compression, for instance, all of a file's data after a damaged section would be impossible to recover. Even if just one byte were zeroed, the dictionary of the decompressor would no longer match the dictionary of the compressor, and everything that came out after the error would be trash. (More likely, the decompressor would crash.) Other compression algorithms may be somewhat less sensitive to corruption, but it's trivial to salvage a noncompressed archive even manually.
edited Mar 20 '17 at 10:17, answered Jun 30 '16 at 0:01 – Ben N
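To make the "salvage a noncompressed archive manually" point concrete, here is a rough sketch of how one might carve stored entries out of a plain .zip made at the "store" level, by scanning for the local file header that sits in front of every entry. This is my illustration only: it assumes entries without data descriptors, skips CRC checking, and does not apply to the 7z container format, which keeps its metadata at the end of the file.

```python
import struct

def carve_stored_zip(path):
    """Scan for zip local-file-header signatures and yield (name, data)
    for entries stored without compression (method 0)."""
    with open(path, "rb") as f:
        blob = f.read()
    pos = 0
    while True:
        pos = blob.find(b"PK\x03\x04", pos)
        if pos == -1 or pos + 30 > len(blob):
            return
        (_ver, flags, method, _time, _date, _crc,
         csize, _usize, nlen, xlen) = struct.unpack("<5H3I2H", blob[pos + 4:pos + 30])
        start = pos + 30 + nlen + xlen
        if method == 0 and not flags & 0x08:            # stored, sizes in header
            name = blob[pos + 30:pos + 30 + nlen].decode("utf-8", "replace")
            yield name, blob[start:start + csize]
        pos += 4                                         # keep scanning past damage

for name, data in carve_stored_zip("damaged.zip"):       # placeholder file name
    print(name, len(data), "bytes recovered")
```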
Just adding to the other answers.
If multiple files can be packed into the same clusters (the block-size unit the file system writes in), they use less disk space overall.
On FAT and NTFS each file occupies its own whole clusters: if a file needs only 1.2 clusters it still takes 2 clusters, and if a grouped file needs 120.2 clusters it takes 121 clusters.
So if items are grouped into a single archive, database, zip, or disk image and stored as one file, that single file wastes less cluster space than the same items stored separately.
Every file has a small amount of wasted space at its end; one huge file also has only one such small amount of wasted space.
To see this cluster usage easily (in Windows), open Properties on a set of files or folders and compare "Size" with "Size on disk": the size on disk is the total cluster space needed to store the files, including the wasted space. The smaller the cluster size, the less waste there is; a rough way to estimate it is sketched below.
On the other hand, smaller cluster sizes tend to be slower for large data. It depends on how you set the cluster size, and whether you sized it specifically for the type of data stored on that partition.
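As an illustrative sketch of the "Size" versus "Size on disk" difference (my example; the 4 kB cluster size and the photos path are assumptions, and it ignores details such as NTFS storing very small files inside the MFT), you can estimate the slack for a folder tree like this:

```python
import os

CLUSTER = 4096  # assumed cluster size in bytes; check your volume's real value

def size_and_size_on_disk(root):
    """Total the logical size versus the cluster-rounded size of a folder tree."""
    logical = on_disk = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            size = os.path.getsize(os.path.join(dirpath, name))
            logical += size
            on_disk += -(-size // CLUSTER) * CLUSTER   # round up to whole clusters
    return logical, on_disk

logical, on_disk = size_and_size_on_disk("photos")     # placeholder folder
print(f"size: {logical:,} B   size on disk: ~{on_disk:,} B   "
      f"slack: ~{on_disk - logical:,} B")
```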
In most scenarios today, commonly stored data already uses some form of compression. It is less wise to compress it again and create "dependency" archive items which, compressed or not, are harder to recover in parts and pieces.
Example: trying to fix a corrupt database with a 2% error, versus recovering 98% of your files as separate items (database recovery can be messy, or needs software designed for exactly that purpose).
Unless the data is being archived deliberately (as a backup method), transferred across the internet, packaged for distribution, or can be highly compressed, it is usually better to keep files separate: not packaged, grouped, further compressed, or encrypted when that is not necessary. Fewer complications (dependencies) mean less software and work needed to package and unpackage, and a better chance of recovering parts of the data after a failure.
Take the example of running compare routines over 1,000 small files versus 1 huge archive. Say the compare reports 3 bad bits of data either way: in one case you have 3 bad files out of 1,000; in the grouped case you have 1,000 files in a group that something is wrong with :-)
Adding complications to data completely unnecessarily does not help the user understand what is there, back it up to other sources, make sure it never gets corrupted, or recover whatever is recoverable if something fails.
One contiguous, sequential block of data (it has to be fully defragmented first) is faster for the hard disk to access. Any additional routines the computer must go through vary in the work required, and in how much faster the result could possibly be.
Compressible files can potentially be (much) faster to read out, even with the extra decompression work involved.
Files that cannot be shrunk much further (already-compressed data) just mean more work and more complication. Whether that is advantageous depends on your goals, given that it can also mean more effort and less direct visibility for the user.
Databases, archives, disk images, and other large blocks of combined small data, compressed or not, can be read faster sequentially and handled by routines designed to process many small items efficiently. Where would we be without databases? Some things would be terribly slow, disorganised messes.
Conclusion: IMO, unless there is a serious need to compress, encrypt, group, or package, a distribution need, or this serves as another archived backup copy, already-compressed data should have fewer complications, not more. The space savings are better addressed with a proper cluster size, and the speed with proper defragmenting. Any time there is corruption, a need to recover data, or even just to understand it, it is better that it is simple.
For business, the web, database access, packaging, and distribution, these methods are great: speedy, helpful, and manageable.
For normal users storing their already-compressed photos and videos, it is the backups and multiple copies of that data that will always matter more than fiddling with packaging it for speed or even for disk-space savings.
So back it up first, before worrying about it being a bit faster.
edited Jun 30 '16 at 0:32, answered Jun 29 '16 at 23:30 – Psycogeek
There is actually another advantage of archives over "regular" folders: if your disk crashes, or you have any other reason to use a low-level file recovery tool (e.g. TestDisk + PhotoRec), you will be happy to recover coherent archives instead of jumbled files stripped of any folder structure.
answered Nov 8 '16 at 10:12 – OlivierM
I recommend compression if you would also like to securely 'lock' a folder with a password (recommended 15–20 characters if using 256-bit encryption). I do this for my porn folders, which are substantially large and also benefit from being compressed, as it creates extra room on the drive. It also allows much faster transfer of the folder between drives.
answered 23 hours ago – peril (new contributor)