I accidentally ran two archiving jobs on the same data. Job 1 was archiving company codes from IN00 through ZZ00, which was the unwanted job. Job 2 archived only the data from IN99 through INZZ (not the whole IN company code range).
Both jobs failed because the log was full (the data volume was too large to archive), but when I expand the jobs in the failed SARA session, the archive files are up to 100 MB in size.
Below are some of the problems that can occur if we archive the same data more than once (which I found in my online search):
- Some archiving objects require that data exists only once in the archive; duplicate data can therefore lead to erroneous results in the totals of archived data.
- Archiving the data again will affect the checksum. A checksum is normally calculated before and after the archiving process; its purpose is to validate that the newly created archive files contain the same contents as the original data.
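The checksum idea in the second point can be sketched as follows. This is a simplified illustration (not SAP's actual mechanism): hash the source data before the write step, then verify the archived copy reproduces the same hash.

```python
import hashlib

def checksum(data: bytes) -> str:
    """Return a hex digest used to compare content before and after archiving."""
    return hashlib.sha256(data).hexdigest()

# Hypothetical record content for illustration only.
original = b"company=IN99;doc=12345;amount=500.00"
before = checksum(original)

# The write step copies the data into the archive file (different structure,
# same content); here we simulate it as a plain copy.
archived = original
after = checksum(archived)

print(before == after)  # True -> archive contents match the original data
```

If the archived bytes differed from the source, the two digests would no longer match and the validation would fail.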
Could anyone advise me on how to overcome this issue of multiple archiving runs over the same data? Apart from the impacts stated above, what other problems can archiving the same data more than once cause?
The failed archive sessions currently appear under "Incomplete Archiving Sessions"; in one week's time the archive delete jobs will run against them and they will be moved to "Completed Archiving Sessions". I would highly appreciate any help.
Source of finding:
Having the same data in two archives does not really hurt.
It would be a much bigger problem if you had already executed the deletion step and deleted data from the tables that shouldn't have been archived.
The write step only duplicates the data from the tables into an archive (just in a different structure).
So it is possible to delete the unwanted archive and keep only the archive with the correctly archived data.
If you keep both archives, you can see the same record twice in an analysis that reads from both archive files (via an index or info structures).
Example: sales order 1000 is archived by two different archiving jobs, so you will find sales order 1000 in archive file 1 and in archive file 2.
You create an index from both archive files.
Now you find sales order 1000 twice in this index.
And if a sales analysis is made for a historic year based on archived records, you will report double the business volume.
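The doubling effect described above can be sketched in a few lines. The data values are hypothetical; the point is that an index built naively from both archive files keeps both copies of the same document, so totals computed from it come out doubled.

```python
# Each archive file holds a copy of the same sales order (hypothetical values).
archive_file_1 = [{"order": "1000", "amount": 500.0}]  # from archiving job 1
archive_file_2 = [{"order": "1000", "amount": 500.0}]  # from archiving job 2

# An index built from both files contains the record twice.
index = archive_file_1 + archive_file_2
total = sum(rec["amount"] for rec in index)
print(total)  # 1000.0 -> double the real business volume of 500.0

# Deduplicating by the document key restores the correct total.
unique = {rec["order"]: rec for rec in index}
correct_total = sum(rec["amount"] for rec in unique.values())
print(correct_total)  # 500.0
```

Deleting the unwanted archive (or flagging it invalid, as suggested next) removes the duplicate source before any index is built from it.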
Go into SARA, click the Management button, and flag the archive as invalid before someone executes the deletion based on that file.
There are several issues here. In this case it seems pretty clear-cut that you did not want the first variant to be executed. Hopefully none of the deletions have taken place for this archive run.
In cases where you have overlapping selection criteria and some of the deletions have already been processed, you can be in a very difficult situation. The best advice I have is to check the CATALOG definition of your archive info structure and make sure that both the archive file and the offset fields are set as DISPLAY fields, not KEY fields.
If the file and offset are key fields, then when you use the archive info structure you will pull up more than one copy of the archived document.
Example: FI document 12345 was archived and deleted in archive run 1 and archive run 2.
A search via the archive info structure, when the file and offset are key fields, would return two results:
12345 from run 1
12345 from run 2
If the CATALOG has the file and offset as display-only fields, you would return only one result:
12345 from (whichever deletion file was processed first)
The second deletion process would have a warning message in the job log that not all records were inserted.
Please note that any direct access that bypasses the archive info structure and reads the data archive files directly would still show two documents, not a single document.
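The KEY-versus-DISPLAY behaviour above can be modelled with a small sketch. This is a simplified illustration, not SAP code: the info structure is modelled as a table keyed by whichever fields are declared as keys, and a duplicate key is skipped on insert (corresponding to the "not all records were inserted" warning in the job log).

```python
def build_info_structure(records, key_fields):
    """Insert records keyed by the given fields; later duplicate keys are skipped."""
    table = {}
    skipped = 0
    for rec in records:
        key = tuple(rec[f] for f in key_fields)
        if key in table:
            skipped += 1  # would surface as a job-log warning in the real system
        else:
            table[key] = rec
    return list(table.values()), skipped

# FI document 12345 was archived (and deleted) in two runs (hypothetical names).
records = [
    {"doc": "12345", "file": "ARCH1", "offset": 0},
    {"doc": "12345", "file": "ARCH2", "offset": 0},
]

# file/offset as KEY fields: the two copies have distinct keys -> two results.
both, _ = build_info_structure(records, ["doc", "file", "offset"])
print(len(both))  # 2

# file/offset as DISPLAY fields: only the document number is the key ->
# one result; the second insert is skipped with a warning.
one, skipped = build_info_structure(records, ["doc"])
print(len(one), skipped)  # 1 1
```

With document-only keys, whichever deletion file is processed first wins, matching the "12345 from (whichever deletion file was processed first)" result above.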