
Archiving to disk using Xinet (The movetree Action)
In this section:
More on the movetree Action
The movetree Action shuttles files and directories between one path in the file system and another, maintaining the underlying directory structure throughout. This is useful for archive-to-disk implementations where you want to move a set of data from a live production area to an “archive” area and then, sometime later, copy it back.
The movetree Action requires two arguments: the two paths between which you want to move a tree of directories and subdirectories. The following figure shows how you specify the paths:
How the movetree Action works
When triggered, the movetree Action determines whether it has been triggered from the Copypath or the Movepath. What it does varies accordingly.
If it is triggered from the path you specified in the Movepath argument, it copies the directory structure into the Copypath location by substituting the path in the Copypath argument for the one in the Movepath box.
Incidentally, you may also use custom keyword values in the Movepath argument. To do so, choose Path from the Movepath pop-up list rather than Browse, so you can enter the path yourself. Then enter the correct path in the field, appending $KEYWORDnnn_VALUE to the end of the path, where nnn is the Data Field’s ID number. If you do not know the ID, the easiest way to look it up is to check the Database, Data Fields, Summary page. The ID column displays the number for each Data Field.
Using a custom keyword value, for example, you might enter a path like:
/Volumes/Production/Customers/$KEYWORD137_VALUE
on a system where a Customer Data Field had been established with an ID=137.
If, in this example, $KEYWORD137_VALUE = Acme, then the Movepath would be /Volumes/Production/Customers/Acme.
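To make the substitution concrete, the following minimal Python sketch mimics how such a token could be resolved. It is an illustration only, not Xinet’s implementation; the substitute_keywords function and the keyword_values dictionary are hypothetical names chosen for the example.

import re

def substitute_keywords(path_template, keyword_values):
    """Replace each $KEYWORDnnn_VALUE token with the value of the
    Data Field whose ID is nnn (illustration only)."""
    def lookup(match):
        return keyword_values[int(match.group(1))]
    return re.sub(r"\$KEYWORD(\d+)_VALUE", lookup, path_template)

# A Customer Data Field with ID=137 whose current value is "Acme"
print(substitute_keywords("/Volumes/Production/Customers/$KEYWORD137_VALUE",
                          {137: "Acme"}))
# -> /Volumes/Production/Customers/Acme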
After successfully copying all the files, the Action removes the original files from the Movepath, leaving that location empty for the time being. (Copying first, then verifying success before removing the files, rather than simply moving them, ensures that no data will be lost should something go wrong during the transfer.)
If the destination already exists in the Copypath location, the movetree Action will append increasing integers to the destination until the path becomes unique—in effect, implementing a versioning strategy.
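As a rough sketch of this behavior, the following Python fragment copies a tree, verifies the copy, and only then removes the originals, appending increasing integers to the destination when it already exists. It is illustrative only and not the actual movetree implementation; the unique_destination and archive_tree names are ours, and the verification shown here is a simple shallow comparison.

import filecmp
import os
import shutil

def unique_destination(dest):
    """If dest already exists, append increasing integers until the path
    is unique (the versioning behavior described above)."""
    if not os.path.exists(dest):
        return dest
    n = 1
    while os.path.exists(f"{dest}{n}"):
        n += 1
    return f"{dest}{n}"

def archive_tree(src, dest):
    """Copy src to a unique destination, verify the copy, then remove src.
    Copying first and deleting only after verification means nothing is
    lost if something goes wrong during the transfer."""
    dest = unique_destination(dest)
    shutil.copytree(src, dest)
    comparison = filecmp.dircmp(src, dest)
    if comparison.left_only or comparison.diff_files or comparison.funny_files:
        raise RuntimeError(f"verification failed; originals kept in {src}")
    shutil.rmtree(src)  # originals removed only after a successful copy
    return dest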
If the movetree Action is triggered within the path in the Copypath box, it will copy files from that Copypath location back to the Movepath location, with one major exception: if the path already exists in the Movepath location, the Action will fail, preventing work from being lost unintentionally through overwriting.
If a movetree Action were to be triggered within an area that does not start with either the specified Movepath or Copypath, it would fail.
In addition, if either the Movepath or Copypath were to lie outside of the Trigger Set’s Active Path, the movetree Action could not take place. Both paths must be included within the Trigger Set’s list of Active Path(s).
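Putting these rules together, the decision a movetree Action makes can be summarized in the following hypothetical Python sketch. It only plans what would happen; it performs no copying, and the function, variable names, and example paths are invented for illustration rather than taken from Xinet.

import os

def plan_movetree(triggered_path, movepath, copypath, active_paths):
    """Work out what a movetree Action would do for a given triggered path,
    following the rules described above."""
    # Both configured paths must lie within the Trigger Set's Active Path(s).
    if not any(movepath.startswith(p) for p in active_paths) or \
       not any(copypath.startswith(p) for p in active_paths):
        raise ValueError("Movepath and Copypath must both be inside an Active Path")

    if triggered_path.startswith(movepath):
        # Archive direction: substitute the Copypath prefix for the Movepath prefix.
        dest = copypath + triggered_path[len(movepath):]
        return ("archive", dest)  # dest is versioned if it already exists

    if triggered_path.startswith(copypath):
        # Restore direction: refuse to overwrite existing work.
        dest = movepath + triggered_path[len(copypath):]
        if os.path.exists(dest):
            raise FileExistsError(f"{dest} already exists; restore refused")
        return ("restore", dest)

    raise ValueError("triggered path is outside both the Movepath and the Copypath")

# Example (hypothetical paths):
# plan_movetree("/space/movetree_live/JobA",
#               "/space/movetree_live", "/space/movetree_archive",
#               ["/space"])  ->  ("archive", "/space/movetree_archive/JobA")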
An Example: Archiving to Disk
Suppose you wanted to set up an archive-to-disk procedure that allowed users with appropriate permissions to move directory trees between a “live” path and a disk where archives were stored. The following steps would set up such a mechanism.
1. Establish a Data Field called archive_restore, which will provide users with a check box in Image Info that allows them to move files between the archive and live areas:
2. Add the new Data Field to a template and assign it to appropriate users.
This gives personnel with the appropriate User Permissions access to the check box in the Image Info window, allowing them to archive to disk and restore. Once you’ve finished setting up the Trigger and Action, changing this piece of metadata lets them move file trees between the archive and live paths and vice versa.
3. Set up a movetree Action, providing both the Movepath and Copypath locations.
4. Create a Trigger Set using the archive_restore Data Field from Step 1. Be sure to set both the Copypath and Movepath locations as Active Paths.
This completes the basic archive-to-disk setup. This mechanism will always restore files to the place in the file system from which they were originally archived. If you would like to restore files to an alternative location, continue with the step below.
[optional] If you wish, you can set up a second Data Field and movetree Action to move files from the archive location to a different path than the one where they originally resided. Continuing the example above, in which changes to the archive_restore Data Field move files between /space/movetree_live and /space/movetree_archive, we could create a second Data Field for the Image Info dialog, called archive_alt_restore, set to move files between /space/movetree_archive and /space/movetree_alt_restore. This in effect implements a restore-to-alternative-path archive-to-disk strategy that once again maintains a version system within the /space/movetree_archive path.
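Summarized as data, one plausible way the two pairings above could be laid out is sketched below. The exact assignment of each example path to Movepath or Copypath is an assumption made for illustration, not a statement of how your Actions must be configured.

# Hypothetical summary of the example: each Data Field drives its own
# movetree Action with its own Movepath/Copypath pair (assumed mapping).
movetree_actions = {
    "archive_restore": {
        "movepath": "/space/movetree_live",         # live production area
        "copypath": "/space/movetree_archive",      # archive area (versioned)
    },
    "archive_alt_restore": {
        "movepath": "/space/movetree_alt_restore",  # alternative restore area
        "copypath": "/space/movetree_archive",      # same archive area
    },
}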
More About Archiving to Disk
Many customers are choosing to maintain large archives on spinning disk rather than on removable media. The main driving factors in this trend are that disks have become considerably cheaper while tape libraries have not followed suit, and that keeping files you may reuse online gives easier, quicker access. This section describes some of the different strategies that can be employed to archive to disk using the tools included in Xinet software, and gives some ideas about why you might deploy them.
Before embarking on a system that involves only online archive, please be aware that you must independently ensure that your system is backed up. This means a regular process that either backs files up to removable media that is taken off site or uses a network system to back up to a different physical location. Regular tests to ensure that you can recover from your backups should be scheduled and performed.
The first step in implementing an archive-to-disk strategy is determining your company’s needs. Often companies will implement an archiving-to-disk regime that is an exact mirror of their previous archive-to-tape system, including all its limitations and quirks. While this is certainly possible, and sometimes appropriate, more often than not changes can be made to improve your overall efficiency. Before the different strategies can be evaluated, determine how much data needs to be “archived” each year, and how much of that data is ever accessed again after archiving (either in whole or in part). Also evaluate how quickly and easily archived data needs to be accessible.
The following topics are provided in this section:
Option 1: One big RAID, one big Xinet volume
Option 2: One RAID, one JBOD (or old RAID), two or more Xinet volumes, operator control
Option 3: One production area, one strict archive area, strict migration policies
Grooming
Option 1: One big RAID, one big Xinet volume
One option is to keep all work that might be reused in place in the production area. This means keeping all the data on a fast RAID system and leaving it where it was created for as long as it might be reused, either with one Xinet volume or several, depending on access and organization needs. This scenario is appropriate for customers whose data is accessed often but has a limited longevity. This system also assumes a skilled and disciplined production crew that can be relied on to manage the production files in a consistent and regimented manner.
An example of a customer that might deploy this sort of system would be a print-on-demand house servicing a vacation cruise line. Customers are likely to request data sheets about any of the cruises available; some of the data (such as boat descriptions) is relatively static, other data (such as special package pricing) changes frequently, and old data (such as package pricing from previous years) does not need to be maintained. This sort of customer wants immediate access to all the relevant data on the server, and wants to be able to feed the press with it automatically as soon as a customer requests it. Refreshing RAID hardware on a regular schedule will be sufficient to handle the growth of files.
Option 2: One RAID, one JBOD (or old RAID), two or more Xinet volumes, operator control
In the case where production is controlled, and jobs get reused relatively often, it may make sense to have a “live” production area and an equally accessible “archive” area on slower, cheaper disk. This sort of setup is appropriate where a shop has an established workflow, and production staff are disciplined and used to controlling jobs from the desktop. In this setup, there is one (or more) production volumes, and one (or more) non-active, but equally accessible volumes. Migration between active and non-active production can be as simple as an operator selecting the job folder (with a Trigger button) when the job is “finished.” Migration could be more complicated, such as a Trigger that automatically moves jobs that have not been accessed in a certain amount of time, or a scheduled move where all jobs are cataloged by customer/month/year after they reach a certain age.
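As one illustration of the more automated approach, a scheduled script could look for job folders that have not been touched for a given number of days and move them to the slower volume. The sketch below is hypothetical: the paths reuse the earlier example, the threshold is arbitrary, modification time stands in for access time, and in a real deployment the move itself would typically be driven by a Xinet Trigger or the movetree Action rather than by shutil.

import os
import shutil
import time

LIVE = "/space/movetree_live"        # hypothetical production volume
ARCHIVE = "/space/movetree_archive"  # hypothetical slower archive volume
STALE_DAYS = 180                     # arbitrary threshold (about six months)

def stale_job_folders(root, stale_days):
    """Yield top-level job folders whose newest file is older than stale_days."""
    cutoff = time.time() - stale_days * 86400
    for job in os.listdir(root):
        job_path = os.path.join(root, job)
        if not os.path.isdir(job_path):
            continue
        newest = max(
            (os.path.getmtime(os.path.join(dirpath, name))
             for dirpath, _, names in os.walk(job_path) for name in names),
            default=os.path.getmtime(job_path),
        )
        if newest < cutoff:
            yield job_path

for job in stale_job_folders(LIVE, STALE_DAYS):
    shutil.move(job, os.path.join(ARCHIVE, os.path.basename(job)))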
An example of a shop that might prefer this setup would be an active, long-established prepress shop with a large stable of customers. Much of the knowledge about customer behavior (for example, how much reuse is likely) lies in the minds of the production people, and this setup lets them control the environment in a way they find easy and familiar. The advantages of this setup are that a) most work that is likely to be reused is already in production, and b) jobs that are less likely to be reused are still easily available without their size impacting everyday searching and navigation.
If you are interested in a system like this, be sure to read about the Xinet movetree Action; see More on the movetree Action.
Option 3: One production area, one strict archive area, strict migration policies
The scenario that most closely mimics traditional archive-to-tape is a setup where the fast production area is “live” and the other disk is essentially a read-only, retrieve-on-demand storage area. This is quite easy to implement with Xinet using Triggers and Actions. A Trigger to move to the archive area can be set up (and usually applied to a button with a Xinet Portal tag). It is trivial to restrict who can archive, and if you wish, you can easily implement an approval process here. For example, most users’ requests to archive could simply execute a Trigger that sends e-mail with a hot-link to a manager. The manager can then choose to archive (which actually performs the move) or not. The other volume has no access privileges (but is searchable, and all previews, mviews, metadata, etc. are available). A button that triggers a copy back to the production volume (either to its original location or with a path mapping, via a custom Action) allows users to begin reusing the job immediately. Trigger Set Configuration and More Details about Triggers and Actions provide more information about Xinet Triggers and Actions.
Another option is to use the Xinet Portal Asset Fulfillment Request plug-in, which requires administrative approval for restoration requests before the files are placed in a retrieval area. More complex systems (for example, file versions that are archived again after being used again) are also possible to implement using Triggers and Actions if desired.
An example of a shop that would use this is one with a large, floating production staff. Only the person in charge of a job would be able to sign it off for archive (generally when it is approved), and a complete record of all produced jobs is maintained. Note that it is not difficult to implement Options 2 and 3 at the same time, giving some users a large amount of control and other users very little by using Xinet permissions and Permission Sets. You can also choose to make the migration to archive more programmatic (by date, access, etc.) rather than under human control if you wish. Retrieval from archive can also be controlled. For example, you might want to have Legal review all non-in-house images that are more than a year old to ensure that rights have been checked. The flexibility of metadata-driven Triggers and Actions allows you to make this as controlled as you wish.
Grooming
Note that most disk archives will tend to grow over time (the one exception we know of is a customer that prints funeral cards, as most people tend to die only once). There needs to be a program in place to consistently refresh hardware to keep up with this growth. Historically, hardware has increased in storage size and processing power faster than stored data grows. It is also worth considering that some data is no longer worth maintaining. For example, it is not necessary to maintain jobs for customers that are no longer in business. In other cases, a particular job may no longer be usable, but some elements (such as a product shot) may be reusable. Often the most important issue is ensuring that you have sufficient organization or metadata to actually be able to find the correct job in your archives.