The University of Arizona

Technical Support of the Personal Information Sweep (IS-G302)

The Personal Information Sweep is designed to guide a computer user through the process largely without assistance. There are, however, instances where technical support may be necessary or desirable. This guideline identifies many of those areas. Each area is presented by reference to the applicable step or steps of the Personal Information Sweep.

Brief Flash and PowerPoint presentations (with notes) for IT staff are available on the Information Security web site. Presentations for users are also available.

The documentation for Spider3 for Windows provides additional information on the features and capabilities of Cornell Spider.

Steps 2 and 8: Deleting Files

Because of the potential for user error, the use of a secure deletion method is not required. Instructions for installing and using free file deletion tools are available in the File Deletion Guideline.

IT staff may wish to schedule a nightly Windows disk defragmentation to help eliminate erased data.

Step 3: Archiving CDs, DVDs and Flash Drives

Users can elect to move personal information to a CD, DVD, or flash drive. Units should develop procedures for archiving CDs, DVDs and flash drives storing personal information.

Steps 3 and 8: Encryption

IT staff should coordinate several aspects of encryption:

  • selection of encryption products
  • key management procedures
  • data recovery procedures
  • backup of encrypted files

Refer to the Encryption Guideline for more information.

Step 4: Software Installation for Windows Clients

The Personal Information Sweep directs users to scan their computers with Cornell Spider. A UA-specific configuration file has been developed to enable scanning for Arizona driver’s license numbers and to eliminate paths and file extensions that have proved to produce false positives. InfoSec does not recommend scanning without the configuration file.

Users are given four options for installation. One relates to system requirements; the others relate to the level of support provided by IT staff. IT staff can elect to:

  • install cornspider.exe, which includes both Cornell Spider and the configuration file. The registry settings are written to the HKEY_CURRENT_USER registry tree, so this option is valid only if you install under the account of the user who will run the program.
  • install Spider3 only, using the MSI file from the Cornell web site, and instruct users to select "If your IT staff installed Spider but NOT the UA-specific configuration, click here to continue" when they reach Step 4 of the Personal Information Sweep procedure.
  • take no action (the user installs both Spider and the configuration file)

For Windows users without administrative privileges, IT staff will need to assist with installation of:

  • .NET 2.0 or later, if not previously installed
  • Cornell Spider

Group Policy or scripts can be used to install Cornell Spider and the .NET Framework on clients. In addition, Group Policy Preferences can be used to configure Cornell Spider, as a replacement for downloading the configuration file in Step 4. Information on using Group Policy is available at the Microsoft TechNet website.

If you experience difficulty in installing the combination Spider/configuration file (cornspider.exe), try one or both of the following:

  • Install the Spider3 MSI file (from Cornell web site), then follow the instructions in Step 4 to install the configuration file.
  • For Windows XP or earlier, check for and install .NET 2.0 or later before installing cornspider.exe:
    • Go to http://www.update.microsoft.com.
    • Click Custom.
    • The update site will analyze your system. When the list of software appears, in the left-hand column, click Software, Optional.
    • Check to see if Microsoft .NET Framework version 2.0 is in the list. If it is, select it, and click Review and Install Updates.
    • Click Install Updates.
    • When the download and installation is complete, restart your computer.
    • Repeat the instructions for installing Spider.

Steps 4-7: Remote Installation and Scanning

Workstations can also be scanned remotely. Run Spider for Linux on a Linux machine, or Spider for Windows on a Windows machine, mount the workstations via NFS or Samba, and scan the mounted volumes. See, e.g., the approach taken at UC Davis.
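As a rough illustration of the mounting step on a Linux scanning host, the sketch below only prints a read-only mount command for review; it does not mount anything or run Spider. The host name, share names, and mount point are hypothetical, and a Samba mount will usually also need credential options appropriate to your environment.

```shell
#!/bin/sh
# Dry-run helper: print a read-only mount command for a workstation share.
# All host names, shares, and mount points used here are hypothetical.
mount_cmd() {
    proto=$1
    host=$2
    share=$3
    mountpoint=$4
    case "$proto" in
        nfs)  echo "mount -t nfs -o ro $host:$share $mountpoint" ;;
        cifs) echo "mount -t cifs -o ro //$host/$share $mountpoint" ;;
        *)    echo "unsupported protocol: $proto" >&2; return 1 ;;
    esac
}

# Print the command so it can be reviewed before being run as root.
mount_cmd nfs ws01.example.edu /export/home /mnt/ws01
```

Mounting read-only is a deliberate choice in this sketch: it lets a scan read the files without risking changes to the workstation from the scanning host.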

Logs should be delivered to users for review.

Step 4: Software Installation for Macintosh Clients

An application combining Spider and the UA-specific configuration file has been developed for Macintosh users. The configuration file enables scanning for Arizona driver’s license numbers and eliminates paths and file extensions that have proved to produce false positives. Although the configuration file is not as effective at eliminating false positives for Macintosh clients as it is for Windows, InfoSec does not recommend scanning without it.

The Spider/configuration application is available at:
www.security.arizona.edu/files/Spider_OSX_at_UA.dmg

Step 4: Software Installation for Linux Clients

The version of Spider available in Step 4 of the Personal Information Sweep has several advantages over the version available from Cornell. Cornell's version is not portable without modification and does not compile on Linux/i386 or MacOS/Intel without editing the Makefile. By contrast, the UA version is portable, and version 1.1.1 compiles on Linux/i386/x86_64/PPC, Solaris/Sparc, HP-UX/PA-RISC, and MacOS X/PPC/x86 with no modification. Spider is available precompiled or for compiling from source.

Development versions of Spider are available at git://git.uits.arizona.edu/spider.git. To obtain the beta or stable version:

  • Type git clone git://git.uits.arizona.edu/spider.git
  • For the beta version, type git checkout --track -b test origin/test; for the stable version, type git checkout --track -b release origin/release
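The two steps above can be wrapped in a small helper, sketched here under the assumption that git is installed and the repository is reachable. The function names spider_branch and fetch_spider are illustrative and not part of Spider; the branch names come from the steps above.

```shell
#!/bin/sh
# Map the guideline's version names to the branch names used in the steps above.
spider_branch() {
    case "$1" in
        beta)   echo test ;;
        stable) echo release ;;
        *)      echo "usage: spider_branch beta|stable" >&2; return 1 ;;
    esac
}

# Clone the repository, then create a local tracking branch for the chosen version.
# The checkout runs in a subshell so the caller's working directory is unchanged.
fetch_spider() {
    branch=$(spider_branch "$1") || return 1
    git clone git://git.uits.arizona.edu/spider.git &&
        ( cd spider && git checkout --track -b "$branch" "origin/$branch" )
}

# Usage (requires network access to the repository):
#   fetch_spider stable
```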

Steps 4-8: Dual Boot Operating Systems and Virtual Machines

Users of dual boot operating systems should scan each operating system with the appropriate versions of Spider.

Intel Mac users with Parallels Desktop or VMware Fusion who are technically inclined, or who can enlist support from local IT staff, may install Spider for Windows in a Windows virtual machine and then mount their Mac disk volume so that it can be searched and cleaned from inside the virtual machine. Consult the Parallels Desktop or VMware Fusion documentation for disk mounting instructions.

Step 6: Scanning Flash Drives and Other Portable Media

The default configuration of Spider scans all connected drives, including connected flash drives. Step 6 includes instructions for separately scanning flash drives or other external media.

Step 6: Scanning Mapped Network Drives

By default, Spider scans a mapped network drive just as it would a local drive. It will be forced to skip any file it does not have the necessary rights to open, or any file that is open and locked by another application.
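Where a share is mounted on a Linux or Macintosh machine, IT staff may want a rough idea in advance of which files a scan would be unable to open. The sketch below is one way to list them; it assumes the scan runs under the same account as the check, uses a hypothetical mount point, and does not detect files that are merely locked by another application.

```shell
#!/bin/sh
# List regular files under a directory that the current account cannot read.
# Directories that cannot be traversed are silently skipped by find.
list_unreadable() {
    find "$1" -type f 2>/dev/null | while IFS= read -r f; do
        [ -r "$f" ] || printf '%s\n' "$f"
    done
}

# Usage (the mount point is hypothetical):
#   list_unreadable /mnt/shared
```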

Shared folders present some technical challenges. IT staff should consider implementing one or more of the following options:

  • assigning responsibility for scanning shared folders
  • scheduling users' scans of shared folders at different times
  • reconfiguring Spider to limit scanning, so that mapped drives are scanned separately and the log files are delivered to users for review (users must be involved in determining whether to retain identified personal information)
    • To exclude all mapped network drives:
      • Go to Spider Configure menu > Settings > Scan Options tab > Disk tab
      • Deselect Network under Drive Types
      • Click File Extension Management button
      • Delete all file extensions under File Extensions to Scan
      • Click Save
      • Run Spider to scan local drive
    • To scan specific paths on the mapped drive:
      • Go to Spider Configure menu > Settings > Scan Options tab > Disk tab
      • Select Network under Drive Types and deselect all others
      • If you haven't already, click File Extension Management button and delete all file extensions under File Extensions to Scan
      • Click Start Directory button
      • Navigate to the folder you want to use as the starting point for the scan
      • Click OK
      • Click Save
      • Run Spider to scan designated paths

NOTE: InfoSec recommends that users either (1) scan all drives in a single pass, or (2) follow the instructions above to scan in two separate passes: first the local drives, then specific paths on shared drives. The UA-specific configuration does not readily allow users to pick and choose drives and paths without substantial understanding and reconfiguration.

Step 6: Scanning Databases

When Spider attempts to scan the data in an Access database, it recognizes where there are linked tables in another database and attempts to scan the other database as well. Generally, a password will be requested when Spider begins scanning. Users should cancel the dialog box. This can require extra attention during the scanning process, particularly when scanning shared folders or computers with multiple Access databases. Users who elect to allow Spider to run after regular work hours may find that its progress is delayed by the appearance of a dialog box.

If, on the other hand, credentials are stored by the application, Spider may attempt to scan the linked database.

Scanning of databases is likely to produce false positives. Users can reduce the likelihood of false positives by completing Steps 2 and 3 for databases before running Spider.

Step 6: Running Spider on Macintosh OS X 10.3 "Panther"

If Spider installs but does not start on Panther, the reason may be a missing library.

To correct this problem, locate and read the crash dump to determine what is missing:

  • If running as root: more /Library/Logs/CrashReporter/Spider.crash.log
  • If running as user: more /Users/username/Library/Logs/CrashReporter/Spider.crash.log

Look for a line similar to: Can’t open library: /usr/lib/libbz2.1.0.dylib (missing)
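Rather than paging through the whole log, the relevant lines can be filtered out. The following is a minimal sketch that assumes the crash log contains lines in the form quoted above; the function name missing_libs is illustrative.

```shell
#!/bin/sh
# Print each library path that the crash log reports as failing to open.
# Matching on "open library:" avoids depending on the apostrophe style in "Can't".
missing_libs() {
    sed -n 's/.*open library: \([^ ]*\).*/\1/p' "$1" | sort -u
}

# Usage, with the log location given above for root:
#   missing_libs /Library/Logs/CrashReporter/Spider.crash.log
```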

If /usr/bin/make is available, download and install MacPorts to install the libraries under /opt/local/lib:

  • Go to http://www.macports.org/
  • At the top left corner, click Install Instructions.
  • Select Panther to download MacPorts.
  • Double-click on the disc image once downloaded.
  • Follow the installer's instructions to complete the installation.
  • Type cd /opt/local/bin
  • Type ./port -v selfupdate
  • Type ./port install bzip2 (the libraries are installed under /opt/local/lib)
  • Type cd /usr/lib
  • Type mv libbz2.a libbz2.a.org
  • Type ln -s /opt/local/lib/libbz2.dylib .
  • Type ln -s /opt/local/lib/libbz2.1.dylib .
  • Type ln -s /opt/local/lib/libbz2.1.0.dylib .
  • Type ln -s /opt/local/lib/libbz2.1.4.dylib .
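The four linking commands can also be expressed as a loop. This is a sketch only: link_bz2_libs is an illustrative name, the library file names are those listed in the steps above, and a name that is missing from the source directory is skipped with a warning instead of producing a dangling link.

```shell
#!/bin/sh
# Link the bzip2 dynamic libraries from a source directory into a destination
# directory, skipping any name that does not exist in the source.
link_bz2_libs() {
    src=$1
    dest=$2
    for lib in libbz2.dylib libbz2.1.dylib libbz2.1.0.dylib libbz2.1.4.dylib; do
        if [ -e "$src/$lib" ]; then
            ln -s "$src/$lib" "$dest/$lib"
        else
            echo "not found, skipped: $src/$lib" >&2
        fi
    done
}

# Usage matching the steps above (run with sufficient privileges):
#   link_bz2_libs /opt/local/lib /usr/lib
```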

Step 8: Opening Files Without Default Applications

In Windows, files open in their default applications. Step 8 includes instructions for using WordPad if no default application is available.

Step 8: False Positives

Cornell Spider may produce false positives, particularly Spider for Macintosh, which is currently available in beta. Users can reduce the likelihood of false positives by completing Steps 2 and 3 to the best of their ability before running Spider.

Contact the University Information Security Officer with any questions or suggestions regarding the configuration.

Step 8: Accessing Files

Some users will very likely ask for assistance in accessing files listed in the Spider log.

In Windows, users can access files by clicking on the Run button or a link in the Spider Log Viewer or text file. The log file for Macintosh and Linux is more difficult to use. Users must locate the files using the path in the log file.
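For Macintosh and Linux users, IT staff could help by checking which logged files still exist before the user goes looking for them. The sketch below assumes the paths have already been copied out of the Spider log into a plain text file, one path per line; the Spider log format itself is not described in this guideline, so extracting the paths is left to the reader.

```shell
#!/bin/sh
# Read file paths (one per line) from a text file and report whether each exists.
# The input file is a hypothetical list prepared by hand from the Spider log.
check_paths() {
    while IFS= read -r p; do
        [ -n "$p" ] || continue
        if [ -e "$p" ]; then
            printf 'FOUND   %s\n' "$p"
        else
            printf 'MISSING %s\n' "$p"
        fi
    done < "$1"
}

# Usage (the file name is hypothetical):
#   check_paths paths_from_log.txt
```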

Step 9: Compliance with Applicable Security Standards

IT staff must ensure that any device on which personal information will be stored meets the Minimum Security for Networked Devices Standard and the Server Security Standard, as applicable.

The Implementation Guideline offers additional information for implementing the Minimum Security for Networked Devices Standard.

For servers, a Server Baseline Review Template is available to assist with implementation of both standards.

Step 10: Registration of Computers Storing Personal Information

Only one registration of any device storing personal information is necessary. IT staff should coordinate registration of file shares storing personal information to avoid duplication.


Italicized terms used in this guideline are defined in the Information Security Terms Guideline.
