Copy verification and Compare issue

Post by syncing_feeling on Wed May 05, 2010 9:35 am

Greetings,

I am trying out SFFS and I have some initial questions.

1) When SFFS performs any kind of copy, backup or sync operation, how does it (or does it) check to be sure that the copied file was successfully copied? I am wanting to protect against any kind of possible corruption, especially when copying over LAN and WAN. In EMC Retrospect, which I currently use (but not for WAN backups), there are options to have it thoroughly compare the copied files to the source files to be sure that the copy was successful. Does SFFS do anything like that? I see where it appears that SFFS can generate MD5 checksums, but that appears to only be for partial file updating, and that would only apply to files above a certain size, I believe.

2) I was running SFFS to copy some large files to a NAS on my LAN and my computer crashed (NOT because of SFFS--I'm having separate computer issues and working on getting a new computer thankfully). SFFS was naturally in the middle of copying a file which it did not complete because of the computer error. After rebooting and running SFFS again, it did not recognize that the file was incomplete even though the date and size were both incorrect. Shouldn't SFFS automatically see that this incomplete file was not a match and thus either replace it or try to do a partial file update (I tried running the profile both ways, plus also a 3rd time with Binary comparison)?

Thanks in advance for your help!
:)
syncing_feeling
 
Posts: 33
Joined: Wed May 05, 2010 9:14 am

Re: Copy verification and Compare issue

Post by superflexible on Wed May 05, 2010 9:43 am

Hello,

in our new version 5, which is currently available as a Preview build, the new option "Verify copied files" on the tab sheet "Files" will perform a complete and thorough verification of the copied data.

The incompletely copied file would be re-copied if you chose the "Exact Mirror" operating mode. Without it, it is not copied again because the timestamp on the NAS is newer than on the source side.

You can also avoid this problem by choosing the "Automatically resume" checkbox on the tab sheet Files->More. This will copy files with a temporary ".incomplete" extension until they are complete. If .incomplete files are left over from a previously interrupted sync, they are automatically resumed unless the source file has changed in between.
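
The temporary-extension approach described above can be illustrated with a short Python sketch. This is not SFFS's actual code: the function name `copy_with_resume` and the chunk size are hypothetical, and unlike the real feature it does not check whether the source file changed between runs.

```python
import os

CHUNK = 64 * 1024  # arbitrary chunk size chosen for this sketch


def copy_with_resume(src: str, dst: str) -> None:
    """Copy src to dst via a temporary '<dst>.incomplete' file.

    If a partial '.incomplete' file is left over from an interrupted run,
    copying continues from its current length instead of starting over.
    """
    partial = dst + ".incomplete"
    offset = os.path.getsize(partial) if os.path.exists(partial) else 0
    if offset > os.path.getsize(src):
        offset = 0  # leftover is longer than the source: start from scratch

    # "r+b" keeps the bytes already written; "wb" creates or truncates the file.
    with open(src, "rb") as fin, open(partial, "r+b" if offset else "wb") as fout:
        fin.seek(offset)
        fout.seek(offset)
        while chunk := fin.read(CHUNK):
            fout.write(chunk)

    # Only a finished copy ever receives the final file name.
    os.replace(partial, dst)
```

Because the destination only gets its real name once the last byte is written, an interrupted run cannot leave behind a truncated file that looks like a finished copy.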
superflexible
Site Admin
 
Posts: 2478
Joined: Thu Dec 31, 2009 3:08 pm

Re: Copy verification and Compare issue

Post by syncing_feeling on Wed May 05, 2010 10:37 am

Thanks for your quick reply!

"Exact Mirror" and "Automatically resume" were both extremely helpful! Thanks!

Regarding question #1: so how does SFFS normally know if a file was copied successfully?

1b) Regarding the version 5 feature "Verify copied files": will that be usable for remote backups over WAN? How will it work?

1c) When backing up over WAN is there currently (in version 4) some kind of check (like MD5) that the file copied correctly, or is there any way to achieve this with all files (not just big ones using Partial file updating)?

I should add that my WAN speed is not great: 20 Mbit/s (download) / 2 Mbit/s (upload) on both ends.

Thanks!
:)

Re: Copy verification and Compare issue

Post by superflexible on Wed May 05, 2010 10:41 am

> Regarding question #1: so how does SFFS normally know if a file was copied successfully?

When the operating system does not report any error, the copy is fine in 99.999% of all cases.

> 1b) Regarding the version 5 feature "Verify copied files": will that be usable for remote
> backups over WAN? How will it work?

It will be usable but the complete file will be downloaded after upload and then compared. So the transfer time will be doubled.

A good alternative would be backing up to Amazon S3. In that case, you do not need the verify option because all uploads are automatically verified by MD5 checksum without having to re-download.

> 1c)
No, except for Amazon S3.
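
To illustrate why a checksum makes re-downloading unnecessary: each side can hash its own copy locally, and only the short digest has to cross the network. The following Python sketch shows the general idea with the standard `hashlib` module. The function names are hypothetical, both paths are local here for simplicity, and this is not a description of how SFFS or Amazon S3 implement it.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_is_intact(source_path: str, copied_path: str) -> bool:
    """Compare two files by digest instead of byte by byte."""
    return md5_of_file(source_path) == md5_of_file(copied_path)
```

In a remote setup, the second digest would be computed by whatever runs next to the destination storage, so only 32 hexadecimal characters travel back rather than the whole file.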

Re: Copy verification and Compare issue

Post by syncing_feeling on Wed May 05, 2010 11:43 am

> When the operating system does not report any error, the copy is fine in 99.999% of all cases.

Okay, that is helpful.

> It will be usable but the complete file will be downloaded after upload and then compared. So the transfer time will be doubled.

I'll have to think some about that. I could use that for some files, but other files will be as large as 1GB so that is not going to be realistic for use over WAN for those (LAN would be fine).

> A good alternative would be backing up to Amazon S3. In that case, you do not need the verify option because all uploads are automatically verified by MD5 checksum without having to re-download.

I did not know that about Amazon S3; that is pretty cool. Unfortunately, I don't think that is an option for me. I am backing up >1TB from one NAS to another NAS; I'll do the initial backup over LAN and then future backups will run over WAN. Future backups will run maybe as often as 1x/day and total backup size will probably usually be around 500MB - 1GB (comprised of various files), but occasionally there will be backups with multiple files of approximately 1GB in size (so maybe total backup size of as much as 20GB); the larger backups will be less frequent.

In theory couldn't the ExtremeSync Remote Service be used to generate MD5 checksums for all files and not just Partial file updates? I understand that may not be in the plans, but I'm just trying to understand what the software can and cannot do.

Also, if I use Partial file updating, do I end up with a complete file at the destination? I don't mean to sound silly, but I'm just wondering if the file will be in one piece (and thus usable as is) or if it will be stored in chunks on the destination (and thus need to be restored via the software in order to be used)?

Perhaps it'd be helpful at this point to give an overview of what all I am trying to accomplish.
In addition to using SFFS for backups on my computer and on my LAN I am wanting to use it for remote backups (as you've gathered).

Here is my configuration:
At home:
-Home Computer1 (WinXP, soon to Win7 64bit) running SFFS.
-Home NAS (ReadyNAS NVX).
At office:
-Office Computer1 (WinXP), running ExtremeSync Remote Service (if needed).
-Office NAS1 (ReadyNAS NVX).
-Office NAS2 (ReadyNAS NV).
Home Computer1 can see Home NAS, Office NAS1 and Office NAS2 (the office NASes are accessible via special VPN software called ReadyNAS Remote). Office Computer1 can see Office NAS1 and Office NAS2; Office Computer1 can also see Home NAS if that is necessary. Note that currently I do NOT have the hardware/software necessary for Home Computer1 and Office Computer1 to see each other, but my impression is that that is not necessary with how the ExtremeSync Remote Service works (please correct me if I am wrong!). Also, files on the Office NASes will not be knowingly modified; I might access them, but I would not be modifying them. So we could treat them as not being modified (because if I change a backup file instead of copying it to my computer it would be my own darn fault!).

At this point assume I'm not going to bother with any versioning (I might do versioning for some profiles, but I'll tackle that later). My goal is to regularly back up my Home NAS to the Office NASes; some data from the Home NAS will go on Office NAS1 and other data will go on Office NAS2 (I'll manage this by how I set up my profiles).

I want to:
a) make the backups as bandwidth efficient as possible since I don't have very strong upload speed;
b) have the file/folder structure on the backups (i.e., the Office NASes) match that of the source (i.e., the Home NAS).
So the options of Exact Mirror, Automatically Resume, Partial File Updating and Zipping have sounded promising to me. It sounds like the Remote Service only unzips zip packages; is that correct? When the Remote Service unzips zip packages, does it then place them in the correct folder structure? In the event of some kind of problem with my source files (e.g., my Home NAS errors out, there is a fire, theft, etc.) I want to be able to go to my Office NASes and access the files normally without having to translate them through any kind of backup software. I want the folder structure on the Office NASes to match the corresponding folder structures on the Home NAS.

Your software is the only one that I have come across that can do partial file backups, recognize moved files (and not re-copy them) and is designed with remote backups in mind. In addition to all of my above questions, do you think that SFFS can do what I am wanting?

Thanks so much for your continued help!!
:D

Re: Copy verification and Compare issue

Post by Scott on Wed May 05, 2010 8:14 pm

superflexible wrote:
> the complete file will be downloaded after upload and then compared. So the transfer time will be doubled.

Not necessarily. I have an aggravatingly-slow upload speed, but excellent download speed, so transfer time will not be nearly doubled in such a case.
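
As a rough worked example of this point, take the 20/2 connection mentioned earlier in the thread and a 1GB file. The sketch below assumes the speeds are megabits per second, that 1GB is 1000 megabytes, and that protocol overhead can be ignored; the function name is hypothetical.

```python
def transfer_seconds(size_megabytes: float, speed_megabits_per_s: float) -> float:
    """Idealized transfer time: 8 bits per byte, no protocol overhead."""
    return size_megabytes * 8 / speed_megabits_per_s


size_mb = 1000                                    # a 1GB file
upload_time = transfer_seconds(size_mb, 2)        # 4000.0 seconds
download_time = transfer_seconds(size_mb, 20)     # 400.0 seconds

# Upload plus verification download, relative to the upload alone.
slowdown = (upload_time + download_time) / upload_time

print(f"upload {upload_time:.0f}s, verify {download_time:.0f}s, factor {slowdown:.2f}")
```

Under these assumptions the verification download adds about 10% to the total time rather than doubling it. The doubling only applies when upload and download speeds are equal.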
Scott
 
Posts: 12
Joined: Mon May 03, 2010 10:20 pm

Re: Copy verification and Compare issue

Post by superflexible on Thu May 06, 2010 2:48 am

> In theory couldn't the ExtremeSync Remote Service be used to generate MD5 checksums?

Yes, I intend to make that possible in a future update.

> Also, if I use Partial file updating, do I end up with a complete file at the destination?

There are two ways of Partial File Updating:
1) up to version 4, the destination file can only be modified directly and it cannot be zipped or versioned. So you always have only one complete file at the destination which matches the source file.

2) since version 5, you can also combine Partial File Updating with Zipping and Versioning. This is called Synthetic Backup. It leaves you with lots of zip files on the destination, each representing a day's changes (for example). You don't have any file on the destination that you can use directly, so you need to use the software to restore and reassemble your file.

Since you don't want 2), you can use Partial File Updating but you cannot combine it with Zipping.

> a) make the backups as bandwidth efficient as possible since I don't have very strong upload speed;

For most files, zipping would be efficient, while Partial File Updating is only efficient for large database files which are modified frequently on a block level basis. The new version 5 uses smaller blocks and will be more efficient for Partial File Updating. But you may have much more success with just using zipping and forgetting about the partial stuff.
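
A minimal sketch of what updating "on a block level basis" can mean: hash fixed-size blocks of the old and new versions and transfer only the blocks whose digests differ. The block size and function names below are hypothetical, since the thread does not state the actual values or algorithm used by the software.

```python
import hashlib
from typing import List

BLOCK_SIZE = 4096  # hypothetical block size; the real one is not given in the thread


def block_digests(data: bytes, block_size: int = BLOCK_SIZE) -> List[str]:
    """MD5 digest of every fixed-size block (the last block may be shorter)."""
    return [
        hashlib.md5(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> List[int]:
    """Indexes of blocks in `new` that differ from, or do not exist in, `old`."""
    old_digests = block_digests(old, block_size)
    new_digests = block_digests(new, block_size)
    return [
        index
        for index, digest in enumerate(new_digests)
        if index >= len(old_digests) or old_digests[index] != digest
    ]
```

With fixed block offsets like these, a file that is modified in place (as many database files are) yields only a few changed blocks. Inserting even one byte near the start shifts every later block, so nearly everything would count as changed, which fits the advice that the technique mainly pays off for a specific kind of file.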

> So the options of Exact Mirror, Automatically Resume, Partial File Updating and Zipping have sounded promising to me.

Again, you cannot combine Partial with Zipping, unless you want to end up with lots of daily changes in separate zip files, which you can do with version 5 (Synthetic Backup).

> It sounds like that the Remote Service only unzips zip packages, is that correct?

Yes. The intention is to speed up the transfer of lots of smaller files by uploading them in a single ZIP file.

> When the Remote Service unzips zip packages, does it then place them in the correct folder structure?

Yes.

> In addition to all of my above questions, do you think that SFFS can do what I am wanting?

Yes.

Re: Copy verification and Compare issue

Post by syncing_feeling on Thu May 06, 2010 9:19 pm

>> In theory couldn't the ExtremeSync Remote Service be used to generate MD5 checksums?
>
> Yes, I intend to make that possible in a future update.

That would be fantastic! I think having something like that is hugely important (to me, at least) for remote backups because of the risk of data corruption.

I actually just had an issue today with a file that was resumed (using the Resume setting) by SFFS after yet another system error (I can't wait to get my new computer!!!). The file appeared to be complete, but when I tested it, it was partially corrupted, and a CRC check confirmed this. In this situation the file was on my LAN, but I could imagine the same thing happening during a WAN transfer where the connection was interrupted briefly. I may re-run all of the backups with the binary compare option as a way to check for any other possibly corrupted files (I only caught this one because I happened to make note of it when resuming the backup after a system error; unfortunately I was not so careful after the other errors), or I might just use an MD5 file checker utility.
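
For a one-off check of this kind on a LAN, a byte-by-byte comparison of two folder trees can be sketched with Python's standard `filecmp` module. The function name `find_mismatched_copies` is hypothetical, and this is only an illustration of the idea, not how SFFS's binary comparison is implemented.

```python
import filecmp
import os
from typing import List


def find_mismatched_copies(source_root: str, dest_root: str) -> List[str]:
    """Return relative paths of source files whose copy is missing or differs."""
    mismatched = []
    for folder, _dirs, names in os.walk(source_root):
        for name in names:
            src = os.path.join(folder, name)
            rel = os.path.relpath(src, source_root)
            dst = os.path.join(dest_root, rel)
            # shallow=False compares file contents instead of trusting
            # matching size and modification time alone.
            if not os.path.isfile(dst) or not filecmp.cmp(src, dst, shallow=False):
                mismatched.append(rel)
    return sorted(mismatched)
```

Because both files have to be read in full, this is practical over a LAN but slow over a WAN link unless something on the remote side does the reading.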

I would love to have a feature where the ExtremeSync Remote Service could generate an MD5 checksum for any file (including zip packages) that was sent, as a way to verify that the file arrived complete. I am guessing that this is not planned for version 5 since it is not in the preview; is this a distant plan, or maybe within the next year? I understand if you don't want to get too specific about the timing; I'd just like some idea of how far off it might be.

Thanks for answering all of my questions thus far.

It does sound like using zip packages would probably be the way to go for me. My large files tend to not change much (except for the occasional renaming or moving, but SFFS should catch and efficiently deal with those, I think), so the partial file updating may not be that important for those after all (or for those that it is I could do a separate profile).

A few other questions:
In Exact Mirror Settings there is a "Delay the deletion by" setting; how does SFFS mark the file for deletion in the specified time period? I am excited to see this setting because I was thinking of requesting it; thanks in advance for fielding my question regarding how it works.

Does the Resume function (copying with a temporary filename) get used for zip packages too? If not, how does SFFS handle an interruption in sending a zip package? Also, currently there is not an MD5 check on zip packages, but just for partial file updates, correct?

I like to have my Windows taskbar at the top of my screen. Currently SFFS always starts at the very top of the screen, underneath my taskbar. This requires me to unlock and then lock the Windows taskbar to get SFFS to shift down. Any chance you could make SFFS (for version 5) better respect the taskbar? I'd be happy if it just appeared in the center of the screen if that was easiest, but anything that was not underneath the taskbar would be great. Thanks in advance for considering this request!

Looking at my previous post, can you confirm that I should be able to use the ExtremeSync Remote Service on Office Computer1? [Edit: added the following for clarification.] Specifically, am I going to be able to configure the Remote Service to monitor Office NAS1 and Office NAS2? I'm trying to do a little testing here at home with a second computer and am having trouble figuring out the syntax for the "Computer Name (if remote)" field as well as how to set it up to monitor 2 NAS devices.

I am really appreciating how responsive you're being to my inquiries and am very hopeful about SFFS being what I am looking for.

Thanks!
:D

Re: Copy verification and Compare issue

Post by superflexible on Mon May 10, 2010 1:00 am

> In Exact Mirror Settings there is a "Delay the deletion by" setting; how does SFFS mark the file for deletion in the specified time period?

It remembers the first time it saw the deletion in its database.
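
The idea of remembering when a deletion was first seen can be sketched as follows. This is a simplified illustration, not the software's real database logic: a plain dictionary stands in for the database, the function name is hypothetical, and forgetting files that reappear on the source is an assumption.

```python
from datetime import datetime, timedelta
from typing import Dict, Iterable, List


def files_due_for_deletion(
    first_seen_missing: Dict[str, datetime],
    missing_now: Iterable[str],
    now: datetime,
    delay: timedelta,
) -> List[str]:
    """Update the 'first seen missing' records and return files past the delay."""
    missing = set(missing_now)

    # Assumption: a file that has reappeared on the source is forgotten again.
    for path in list(first_seen_missing):
        if path not in missing:
            del first_seen_missing[path]

    # Record the time only the first time a deletion is noticed;
    # setdefault leaves an existing timestamp untouched on later runs.
    for path in missing:
        first_seen_missing.setdefault(path, now)

    return sorted(
        path for path, seen in first_seen_missing.items() if now - seen >= delay
    )
```

Each run passes in the files that exist on the destination but no longer on the source. A file is only returned once the delay has elapsed since the run that first noticed it missing.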

> Does the Resume function (copying with a temporary filename) get used for zip packages too?

Yes, and uploads can be resumed even if the profile stops completely and is started again later.

> Also, currently there is not an MD5 check on zip packages, but just for partial file updates, correct?

Yes.

> Looking at my previous post, can you confirm that I should be able to use the ExtremeSync Remote Service on Office Computer1? Specifically, am I going to be able to configure the Remote Service to monitor Office NAS1 and Office NAS2?

Yes, but the ExtremeSync Remote Service must be given a user account so that it can access the LAN. To do that, you need to go to Windows Control Panel->Administrative Tools->Services and edit the properties of the ExtremeSync Remote Service.

Re: Copy verification and Compare issue

Post by syncing_feeling on Wed May 12, 2010 11:04 am

Great! Thanks for answering my questions!

> Yes, but the ExtremeSync Remote Service must be given a user account so that it can access the LAN. To do that, you need to go to Windows Control Panel->Administrative Tools->Services and edit the properties of the ExtremeSync Remote Service.

That worked great. For others' reference, here's what I did:
-This is all on Win XP Pro.
-In the Properties window for the ExtremeSync Remote Service (via Admin Tools->Services as described above) -> Log On tab -> Log on as: This account -> selected Browse...
-> in Select User window -> selected Advanced -> selected Find Now -> selected the user I wanted in the list and selected OK -> (you should now see something like COMPUTER_NAME\USER_NAME) selected OK again
-> we're now back at the Log On tab -> the This account: field now has .\USER_NAME in it -> type your password in the two password fields -> select Apply
-> select General tab -> select Stop (to stop service) -> select Start -> select OK -> you're done with setting the user account (unless I forgot something!).
-In the ExtremeSync Remote Service Control Panel: I left the Computer Name field empty -> on the Configure Unzipper tab and the Configure Checksummer and Remote Lister tab I entered my paths like so:
\\NAS1\SHARENAME\SUBFOLDER
\\NAS2\SHARENAME\SUBFOLDER
So far this has worked great (I haven't tested yet via ReadyNAS Remote; everything has been on my LAN thus far).

A few other questions (I always seem to have more! :shock: ):
-What happens if the computer which is using the Remote Service is temporarily off (eg, being rebooted or something) while a profile is running? I would assume that the profile would eventually error out due to not getting a response from the Remote Service, and I would assume that this is governed by the Access & Retries -> Waiting and Retrying options; is that correct?

-I assume that SFFS detects moved files using its database. Does it detect moved files in all Operation Modes, especially Exact Mirror?

-The BackupCreations site says that the Detect Moved Files feature works like this:
> Our software creates a file list then does a comparison with file names and a time and date stamp. Then an MD5 checksum is done on the files to see if they are an exact match. If they are, they do a remote move instead of another file copy.

Is that how SFFS currently works, and if so, can the Remote Service handle the MD5 checksum creation? If not, any chance of that being added to version 5?
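
The procedure in the quoted description can be sketched roughly like this. It is only an illustration of the quoted steps (cheap metadata match first, MD5 confirmation second), not SFFS's actual algorithm; all function names are hypothetical, and matching on whole-second timestamps is an assumption.

```python
import hashlib
import os
from typing import Dict, List, Tuple


def md5_of_file(path: str) -> str:
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def list_files(root: str) -> Dict[str, Tuple[str, int, int]]:
    """Map relative path -> (file name, size, whole-second modification time)."""
    result = {}
    for folder, _dirs, names in os.walk(root):
        for name in names:
            full = os.path.join(folder, name)
            info = os.stat(full)
            result[os.path.relpath(full, root)] = (name, info.st_size, int(info.st_mtime))
    return result


def find_moved_files(source_root: str, dest_root: str) -> List[Tuple[str, str]]:
    """Return (old_path, new_path) pairs that look like moves on the source side."""
    source = list_files(source_root)
    dest = list_files(dest_root)
    new_on_source = sorted(p for p in source if p not in dest)
    orphaned_on_dest = sorted(p for p in dest if p not in source)

    moves = []
    for new_path in new_on_source:
        for old_path in list(orphaned_on_dest):
            # Cheap check first: same file name, size and timestamp.
            if source[new_path] != dest[old_path]:
                continue
            # Expensive check second: confirm the contents really match.
            if md5_of_file(os.path.join(source_root, new_path)) == md5_of_file(
                os.path.join(dest_root, old_path)
            ):
                moves.append((old_path, new_path))
                orphaned_on_dest.remove(old_path)
                break
    return moves
```

Each returned pair could then be handled as a move on the destination instead of a fresh copy. Over a WAN the second checksum would have to be computed on the remote side for this to save any bandwidth.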

-Does the Binary Comparison feature basically compute MD5 checksums to compare? Any chance of the Remote Service being able to do that in version 5?

-Same questions regarding the new Verify Copied Files option: does it basically compute MD5 checksums to compare? Any chance of the Remote Service being able to do that in version 5?

You're probably noticing a pattern to some of my questions. ;) I am very, very interested in your software and especially the capabilities that the Remote Service offers. The main part of SFFS appears to be very versatile and powerful, and the Remote Service is the missing link that I have been looking for in backup software.

To sum up, here is what I want/need the Remote Service to be able to do (in addition to what it currently does):
1a) Be able to create MD5 checksums for all files (eg, regular files, zip files, zip packages, encrypted files, etc.).
1b) I don't know whether it is possible for file corruption to happen in the process of creating a Zip Package, so it might be safer if MD5 checksums were created and compared on the files individually after they were unpacked on the destination as opposed to just doing an MD5 checksum comparison on the Zip Package only.
2) Be able to create MD5 checksums for all occasions where file comparisons are being done (eg, Binary Comparison, Verify Copied Files, Detect Moved Files, etc.).

Basically, any function that would be slow over WAN I want to be able to have the Remote Service perform instead (although I can't think of anything else at the moment that is not included in the above Wish List).

Is it at all possible that the above features could make it in to version 5 (or at least 5.1)?

I hope that my persistent questions do not come across as demanding. Your software is the only software I have found that can (and hopefully will) do all of the things I am wanting/needing, which is extremely encouraging (I have been trying for years to find a way to efficiently back up large amounts of data offsite where the files don't end up in some proprietary format). I am just trying to get a sense of the possibility of and/or your projected timeline on these features.

Thanks yet again!
:D
