Import Examples
Following are two examples of imports, with the use of flags explained. For the most up-to-date information about the import command options, rely on its online usage summary. Type the command without providing any arguments:
Unix: /usr/etc/venture/bin/import
Windows: C:\Program Files\Xinet\Venture\bin\import
Case one: files are moved onto new server before running import
In this example, we copy a folder called Images from one server to another. On the original server, Images was in /raid/2014/. On the new server, it will be in /Volumes/Imports/OSX/CustomerData. A database tracked the location of files and metadata on the original server. The goal now is to copy the files to the new server and import the metadata into Xinet, associating it with the correct files:
1.
2.
Here are three example comma-delimited records from an export file:
start
/raid/2014/Images,textshort1,tttteeeeeexxxxxxtttttloooonnngggg1,3/15/2014,2014/3/15 11:11:11 AM,25,251,250001,25.52,1,Bylinetest1,25
/raid/2014/Images/Dalim Demo,textshort2,tttteeeeeexxxxxxtttttloooonnngggg2,3/15/2014,2014/3/15 11:11:11 AM,25,252,250002,25.52,1,Bylinetest2,25
/raid/2014/Images/Dalim Demo/4A.dct,textshort3,tttteeeeeexxxxxxtttttloooonnngggg3,3/15/2014,2014/3/15 11:11:11 AM,25,253,250003,25.52,1,Bylinetest3,25

The export file has the path to the files and all associated metadata. It is usually important that the export file provides paths to the files as they were on the old server. Typically, the imported files will have the same or a similar folder structure. The metadata in the export file will be matched up with fields created in the Xinet database.
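Because each import flag is matched to a column by position, it can help to confirm that every record has the same number of fields before building the command. The following is a minimal sketch, not part of the Xinet tools: count_fields is a hypothetical helper name, and it assumes a single "start" marker line before the data and no commas embedded inside field values.

```shell
#!/bin/sh
# count_fields FILE: print the number of comma-separated fields in each
# data record that follows the "start" marker line.
# Assumption: no field value contains an embedded comma.
count_fields() {
    awk -F',' '
        seen && NF     { print NF }   # data record: report its field count
        $0 == "start"  { seen = 1 }   # marker line: data begins on the next line
    ' "$1"
}
```

Running count_fields export.txt | sort -n | uniq -c summarizes the counts. For the sample records above, every record should report 12 fields; a record with a different count usually points to a stray separator or a missing value.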
3.
Name Type
textshort (16 chars)
textlong (60 chars)
date4 (the 4th date listed in Administration view)
date18 (the 18th date listed in Administration view)
int1 (1 byte integer)
int2 (2 byte integer)
int4 (4 byte integer)
float (float)
boolean (boolean)
Byline (IPTC text)
Urgency (IPTC tiny int)
4.
Construct your import command. (On Unix systems, the import program lives in /usr/etc/venture/bin; on Windows systems, it’s in C:\Program Files\Xinet\Venture\bin).
Tip: You may want to make a script that contains your test import command. Typically one has to run the command a few times to work out syntax errors. Putting the command in a file and running the file as a script lets you edit the command more easily than retyping everything at the command-line prompt.
– Begin the file with #!/bin/csh -f.
– Then, write the import command.
– Set the file’s permissions so that the file is executable. Suppose the file is named cmd:
# chmod 777 cmd
– Run cmd, directing output to a log file so you can see any debugging messages:
./cmd >& log &
An import command to handle the example export file above would look like this.
/usr/etc/venture/bin/import "export.txt" -onlyexisting -s2c \
-r0A -g73746172740a -ipath_to_file -ttextshort -ttextlong \
"-ddate4'MM/DD/YYYY':def=09/20/1972" \
"-ddate18'YYYY/MM/DD hh:mm:ss AA':def=1972/09/20 11:11:11 PM" \
-uint1 -uint2 -uint4 -ffloat -uboolean -tByline -uUrgency \
-b"/Volumes/Imports/OSX/CustomerData%10c1%"
where export.txt is the name of the export file.
Here’s the same command, separated flag-by-flag, for easier inspection:
/usr/etc/venture/bin/import "export.txt"
-onlyexisting
-s2c
-r0A
-g73746172740a
-ipath_to_file
-ttextshort
-ttextlong
"-ddate4'MM/DD/YYYY':def=09/20/1972"
"-ddate18'YYYY/MM/DD hh:mm:ss AA':def=1972/09/20 11:11:11 PM"
-uint1
-uint2
-uint4
-ffloat
-uboolean
-tByline
-uUrgency
-b"/Volumes/Imports/OSX/CustomerData%10c1%"
In an iterative process, adjust arguments to the import command. Here is what each argument in our example does:
Argument Explanation
“export.txt” In our example, the name of the file containing the exported database. In real life, you can name the file anything you choose.
-onlyexisting The import will only match files that have records in the database already. If the file can't be found, no new (or “virtual”) entry is made.
Also see the -onlynew flag, described in Case 2.
If neither flag is used, the import program will match with existing records if they can be found, and make new records if they cannot be found.
-s2c Hex value of the separator between each column in a record. In this case, a comma is the separator.

On Unix systems, the command od -x will provide this value for you. See od(1) for details. On Windows systems, use a Hex editor to determine the value.
-r0A The hex value of the separator between each record. Typically that’s a newline (line feed), whose hex representation is 0A.
-g73746172740a The last string (in hex) before the start of the data. In the export.txt file, the string is start plus a newline (thus the final 0a).
You may alternatively use a -o flag, which tells, in hexadecimal, how many bytes of the file to skip.
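The hex values for -s, -r, and -g can be worked out at a shell prompt. The sketch below is one way to do it with standard tools; to_hex is a hypothetical helper name, not a Xinet command. It uses od -tx1, which prints one byte at a time, because od -x groups bytes into two-byte words whose order depends on the host's byte order.

```shell
#!/bin/sh
# to_hex STRING: print the bytes of STRING as a continuous hex string.
# Backslash escapes such as \n and \t in STRING are expanded first.
to_hex() {
    printf '%b' "$1" | od -An -v -tx1 | tr -d ' \t\n'
    echo
}

to_hex ','          # column separator in the example: prints 2c
to_hex 'start\n'    # marker line plus its newline: prints 73746172740a
```

The two results correspond to the -s2c and -g73746172740a arguments in the example command.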
The following flags deal with the actual data in the export file. The first argument deals with the first column, the second one the second column, and so on.
 
-ipath_to_file The -i argument means ignore the first column, that is, the path in our example. While it may seem odd to ignore the path, the path is really dealt with by the -b argument, as explained below. The -i argument can be used to ignore any field. The string following the -i (path_to_file in this case) is purely optional. It’s a good way, however, to note what is being skipped.
-ttextshort The -t argument identifies a text-field name, as it exists in the Xinet database. In this example the field’s name is textshort.
-ttextlong Another text field (also previously established in Xinet) called textlong.
-ddate4'MM/DD/YYYY':def=09/20/1972
Delineates a date field called date4 that uses the format in the quotes. The def= value provides the default to be used when the exported data doesn’t contain a value. A default is optional.
-ddate18'YYYY/MM/DD hh:mm:ss AA':def=1972/09/20 11:11:11 PM
Another date field. The format of all dates follows these rules:
Y = year
M = month
D = day
h = hour
m = minute
s = second
AA = am or pm
As an example, consider the date December 1st, 2009. For a format like 12/01/2009, the date argument should be MM/DD/YYYY. For a format like 12/01/09, the date argument should be MM/DD/YY. For a format like 09/12/01, the date argument should be YY/MM/DD.
The format of the date in the export file does not have to match the format of the date field in Xinet. Upon import, the dates are converted to Unix time. The display format of the date field in Xinet determines what the date will look like when actually seen in Xinet.
-uint1 Integer field
-uint2 Integer field
-uint4 Integer field
-ffloat Float field
-uboolean Boolean field
-tByline A text field that happens to be an IPTC field.
-uUrgency An integer field that happens to be an IPTC field.
-b"/Volumes/Imports/OSX/CustomerData%10c1%"
The goal of the -b argument is to construct the paths to all the files being imported. Without knowing the path to the file, the import command will not attach metadata.
Part of the argument will be a constant. All the files will be within a certain area on the server, say /Volumes/Imports/OSX/CustomerData. If that part of the path will be common to all files, it should be the first part of the argument.
Another part of the argument will vary from file to file. Files will be in different directories and have different names. The location and names of the files are in the export file in fields, and the import command needs to find them as it runs through each record. To find this information, some variables are used.
The two % signs surround a variable. A number within those signs stands for a field number in the export file. 1 stands for the first field, 2 for the second field, and so on. Thus, %5% refers to the fifth field, and the value of that field will be substituted in place of the %5%. If field 5 has the path to the file and its name, this would be the field number to use.
Note for Windows
When using the import command interactively, that is, on the command line, use only one percent sign on each side of the field number, e.g., %5%. When scripting a .bat file, however, use double percent signs on each side of the field, e.g., %%5%%.
A number followed by a c means to “clear” that number of characters from the string with which the field number was associated. For example, %10c5% means to substitute in the value of the fifth field, but remove the first ten characters of the string. If the fifth field's value is /raid/data/Customers/, then %10c5% would evaluate to /Customers/.
An m means to convert the path format of the string from the Macintosh style to the Unix style. The Macintosh style uses colons to separate folders while Unix uses forward slashes. The m converts to the Unix style. An example of using it in conjunction with the clear flag would be: %10cm5%.
There is no need to convert from Windows style paths (backslashes) to Unix. These are converted automatically.
Multiple fields can be specified within the two % signs. For instance, %5%%6%%7% is a valid argument. This can be useful if the path to the file is not in one field but split into several fields. The path can be “constructed” in this manner.
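Choosing the number for the c modifier means counting the characters in the old path prefix, which is easy to get wrong by one. The sketch below only imitates the behavior described above so the count can be checked at a shell prompt; prefix_len and clear_chars are hypothetical helper names, not part of the import program.

```shell
#!/bin/sh
# prefix_len STRING: number of characters in STRING, that is, the value
# to put in front of the c modifier.
prefix_len() {
    printf '%s\n' "${#1}"
}

# clear_chars N STRING: drop the first N characters of STRING, imitating
# what the guide describes for %<N>c<field>%.
clear_chars() {
    printf '%s\n' "$2" | cut -c"$(( $1 + 1 ))"-
}

prefix_len '/raid/data'                    # prints 10
clear_chars 10 '/raid/data/Customers/'     # prints /Customers/
```

Note that the leading slash of the remaining string is kept, so the constant part of the -b argument should not end with a slash of its own.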
Now let’s look at our example.
Remember that we had a folder called Images which we moved to the new server. We placed it in /Volumes/Imports/OSX/CustomerData, the new home for Images. Xinet copied all the file information into its database at the time the files were moved onto the server. The old server had the images in /raid/2014/Images. An example record from the export file would show:
/raid/2014/Images/Dalim Demo/4A.dct
Each record in the export has a path similar to that. On the new server, that file’s location is:
/Volumes/Imports/OSX/CustomerData/Images/Dalim Demo/4A.dct
The -b argument begins with the part of the path that will be common to all the imported files:
/Volumes/Imports/OSX/CustomerData
The second part of the location, Images/Dalim Demo/4A.dct, is specified by %10c1%. Here, the final 1 is a variable representing the first field in the export file. For this sample record, the value is /raid/2014/Images/Dalim Demo/4A.dct. The 10c tells the import program to skip the first 10 characters in that string. So /raid/2014 is stripped out, leaving /Images/Dalim Demo/4A.dct. The program places this string after /Volumes/Imports/OSX/CustomerData, giving us the grand result: /Volumes/Imports/OSX/CustomerData/Images/Dalim Demo/4A.dct
Now that the file is “matched,” its metadata can also be read in. The same process happens for each record in the export file.
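Before running a real import, the paths that the -b argument is expected to produce can be previewed and checked against the file system. The following is a rough sketch under the same assumptions as the sample export file (a "start" marker line, comma-separated fields, the old path in field 1); preview_paths is a hypothetical helper and does not call the import program.

```shell
#!/bin/sh
# preview_paths FILE BASE N: for each data record in FILE, print BASE
# followed by field 1 with its first N characters removed, imitating
# -b"BASE%<N>c1%".
preview_paths() {
    awk -F',' -v base="$2" -v n="$3" '
        seen && NF     { print base substr($1, n + 1) }
        $0 == "start"  { seen = 1 }
    ' "$1"
}

# Report constructed paths that do not exist on this server.
if [ -f export.txt ]; then
    preview_paths export.txt /Volumes/Imports/OSX/CustomerData 10 |
    while IFS= read -r path; do
        [ -e "$path" ] || echo "missing: $path"
    done
fi
```

Any path reported as missing suggests either that the file was not copied or that the constant part or character count in the -b argument needs adjusting.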
5.
Run the import command.
Make sure that users aren’t working on the files at the time of the import. The dblogd daemon does not need to be stopped.
Case two: files are not moved onto new server
In this case, the images are not actually moved to the server. Instead, they will appear as archived files with the nearline tag in Xinet. Users will be able to see the file names, characteristics, previews, and metadata. Let’s say that the old server had the Images directory located at /Volumes/Test/OPI Testing. On the new server, we want the “archived files” to appear under /raid/CustomerData/Archives. Here is how to proceed:
1.
/Volumes/Test/OPI Testing/Images,8001,7001,6001,11/11/11 11:11:11 AM,22/22/22 22:22:22 AM,33/33/33 33:33:33 AM,..CT,8BIM,textshort1,tttteeeeeexxxxxxtttttloooonnngggg1,03/15/2014,2014/03/15 11:11:11 AM,25,251,250001,25.52,1,Bylinetest1,25
/Volumes/Test/OPI Testing/Images/alias EPS,8002,7002,6002,11/11/11 11:11:11 AM,22/22/22 22:22:22 AM,33/33/33 33:33:33 AM,..CT,8BIM,textshort2,tttteeeeeexxxxxxtttttloooonnngggg2,03/15/2014,2014/03/15 11:11:11 AM,25,252,250002,25.52,1,Bylinetest2,25
/Volumes/Test/OPI Testing/Images/alias EPS/cmykepsbinarywithpath,8003,7003,6003,11/11/11 11:11:11 AM,22/22/22 22:22:22 AM,33/33/33 33:33:33 AM,..CT,8BIM,textshort3,tttteeeeeexxxxxxtttttloooonnngggg3,03/15/2014,2014/03/15 11:11:11 AM,25,253,250003,25.52,1,Bylinetest3,25
2.
3.
Construct the import command using the appropriate arguments. The arguments are largely the same as in Case 1, but there are new flags you might want to use.
For this second example, we use:
Argument Explanation
-z (file size)
-w (image width)
-h (image height)
-T (file mac type)
-C (file mac creator)
-onlynew If this flag is used, the import program will only create records for files that do not already have records in the database. Every record it makes is new (or “virtual”).
Also see the -onlyexisting flag, described in Case 1.
If neither flag is used, the import program will match with existing records if they can be found, and make new records if they cannot be found.
-m '<date definition string>'[:def=<default value>] (file modified date)
-c '<date definition string>'[:def=<default value>] (file creation date)
-a '<date definition string>'[:def=<default value>] (file archive date)
Here’s a sample import command, where the import file is called export.txt:
/usr/etc/venture/bin/import "export.txt" -onlynew -s2c -r0A \
-g73746172740a -ipath -zFileSize -wWidth -hHeight -mModifyDate \
-cCreateDate -aArchiveDate -TType -CCreator -ttextshort -ttextlong \
"-ddate4'MM/DD/YYYY':def=09/20/1972" \
"-ddate18'YYYY/MM/DD hh:mm:ss AA':def=1972/09/20 11:11:11 PM" \
-uint1 -uint2 -uint4 -ffloat -uboolean -tByline -uUrgency \
-b"/raid/CustomerData/Archives%31c1%"
Here’s the same command, separated flag-by-flag, for easier inspection:
/usr/etc/venture/bin/import "export.txt"
-onlynew
-s2c
-r0A
-g73746172740a
-ipath
-zFileSize
-wWidth
-hHeight
-mModifyDate
-cCreateDate
-aArchiveDate
-TType
-CCreator
-ttextshort
-ttextlong
"-ddate4'MM/DD/YYYY':def=09/20/1972"
"-ddate18'YYYY/MM/DD hh:mm:ss AA':def=1972/09/20 11:11:11 PM"
-uint1
-uint2
-uint4
-ffloat
-uboolean
-tByline
-uUrgency
-b"/raid/CustomerData/Archives%31c1%"
4.
Run the import command.
When the command has been run, the files will appear as “Near Line” files in Xinet.
In the database, each file will have an entry in the archivefile table. There will also be a new archivemedia entry with a medianame of IMPORTED. The archive file entries will use the mediaid for the IMPORTED entry.
An archive media other than IMPORTED can be specified with the -mMediaName flag. The only requirement is that the archivemedia record for MediaName already exists in Xinet.
The archive state (whether a piece of media is online, nearline, or offline) can be changed with the venturelog utility, or by altering the archivemedia table in the MySQL console.
Additional notes
There is a -preflight flag that will parse the export file but not take any action. Xinet highly recommends using this flag before doing a real import.
In any case, it is fine to rerun the import command. Existing metadata values will be overwritten.
There is a -D flag for debugging, which is useful if the import command fails to find metadata fields or files. If a file can’t be found, the debugging file will show something like:
File /Volumes/California/Imports/OSX/CustomerData/Images/file_list is not in the database, skipping record.
There is an -ignoredataerrors flag that will proceed with the import regardless of errors.
There is an -ignoremissingfields flag that will also proceed with the import regardless of errors.
If you run import without any arguments, you’ll get a usage summary.
Default values are not limited to date fields. For example, here is an integer field called Days_To_Complete with a default value of “30”: -uDays_To_Complete:def=30.
The import command expects the language character set of the data being imported to be the same character set used by the Xinet database. By default, Xinet uses Latin1, which is sufficient for most European language strings. If the export file is also in the Latin1 character set, then the import command does not need to do anything special to handle the special characters. Likewise, a database using SJIS will have no trouble importing an export made in SJIS.
If a field in the export file is in a different character set than the one used by the database (which is UTF8 by default), then the import command should note, per field, which set was used when making the export, using the charset flag. For example:
/usr/etc/venture/bin/import BAHAG_Beschreibung.txt -onlyexisting -s09 -r0A -g232073746172743A0A "-iUnixPathName" "-iMacName" -TType -CCreator "-mModification Date'YYYY-MM-DD hh:mm:ss'" "-cCreation Date'YYYY-MM-DD hh:mm:ss'" "-tBAHAG_Beschreibung:charset=latin1" "-b%1%"
If text data fields in the export file have values in UTF8 characters and the database does not use UTF8, then the import argument should read:
-t (Field):charset=utf8
For example, if the data field is called people, the following import argument is used:
-tpeople:charset=utf8
This needs to be done for each data field. Character-set arguments for the Xinet import command include:
latin1 (Standard Roman character set; common on PCs)
utf8 (Unicode; OS X native encoding)
sjis (Shift-JIS Japanese encoding)
mac (Legacy OS 9 native Macintosh Roman encoding)
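When it is unclear which character set an export file uses, a quick check with the standard iconv utility can at least show whether the file is valid UTF8. This is only a sketch: is_valid_utf8 is a hypothetical helper name, and a failed check does not say which of the other character sets (latin1, sjis, mac) applies.

```shell
#!/bin/sh
# is_valid_utf8 FILE: succeed (exit status 0) if FILE decodes cleanly
# as UTF-8. Plain ASCII files also pass, since ASCII is a subset of UTF-8.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Example use against an export file, if one is present.
if [ -f export.txt ]; then
    if is_valid_utf8 export.txt; then
        echo "export.txt is valid UTF-8 (or plain ASCII)"
    else
        echo "export.txt is not UTF-8; check which charset= value applies"
    fi
fi
```

If the result differs from the character set used by the database, add the matching :charset= value to each affected text field.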
Note that the file on the server will be encoded using Xinet’s format. This is not the same encoding used on the Mac (which will be Latin1, MAC, UTF8, etc). For special characters, that means that the file name will not look the same if the Unix name and the Mac name are compared. Xinet adds both names to the database record, but when importing it matches against the Macintosh name.
The -A flag is used to import archive information about files. It is useful for migrations from one server to another. Typically, you would export this information from a Xinet server into a file using the export command. The export command writes all the required information in the required format. If that data needs to be preserved on the new server, simply adding the -A flag to the import command will be enough.
Alternatively, you can add archive information by hand to the file you plan to use with the import command (that is, the “exported” file).
You should add information in two places:
1.
There should be one entry for each tape:
where:
online, nearline, and offline require values of either 0 or 1.
mediaid requires a number
tapename requires a name
the delimiter must be a <tab>
there must be a final <tab>
2.
This is a single field, separated from other fields by whatever delimiter has been specified. (In our example below, we will assume that is a <tab>.)
Within this field, the information is split up by commas:
<archive date in unix time>, <media id>, <archive number>, <size>, <thumbnail offset>, <thumbnail size>, 0x00<tab>
The final <tab> in this example marks the end of this field. The final 0x00 is necessary at the end of the field.
A single file may have been archived more than once. After a “;” the archive information can be repeated for the file as many times as necessary:
<archive date in unix time>, <media id>, <archive number>, <size>, <thumbnail offset>, <thumbnail size>, 0x00;<archive date in unix time>, <media id>, <archive number>, <size>, <thumbnail offset>, <thumbnail size>, 0x00...<tab>