DROID 5.0
Welcome to the DROID 5.0 project development wiki. This wiki is for the DROID community to collaborate on developing the next version of DROID.
Who are 'we'?
We are The National Archives, the developers of the DROID tool. Having completed the proposals cataloguing process, we would like to encourage all stakeholders to help define the issues they have experienced with file formats.
What we would like you to do: Stage 2
Several proposals for DROID development highlight the need to improve identification of file formats in DROID. These are: Proposal 34, Proposal 57 and Proposal 58.
We would like to ask the community to help us in these ways:
- Tell us about any specific format issues you are aware of
Please fill out the issue catalogue below, in the same way we gathered proposals for specific issues with formats. Edit any existing issue if you have something to add.
- Send us some example files
Where possible, we would also like example files that exhibit the issue; without them we may not be able to replicate the issue and fix it.
You can do this by emailing a file to PRONOM@nationalarchives.gov.uk. Please make sure that any files you send are free of any copyright or sensitivity issues, and that you are happy for them to be placed into the public domain as part of our test suite. Alternatively, if you can supply a test file for internal development but do not want it released into the public domain, please state this clearly in the email.
If you have not used this wiki before, please read the User Guide before adding or changing anything. Don't worry if you have not edited a wiki before; it is really easy. Thank you for your input!
DROID 5.0: Format Issues Catalogue
| Issue 01: Multiple positive identifications of BIFF files | Issue 02: Too many external signatures for text files | Issue 03: Multiple positive identifications of TIFF files |
| Issue 04: Word 97/2003 identified as OLE2 compound document format | Issue 05: dat files identified as ESRI MapInfo Data File | Issue 06: Unidentified PDF files |
| Issue 07: Tentative Identification | Issue 08: New formats | Issue 09: RTF positive generic |
| Issue 10: DBF File Format | Issue 11: Placeholder | Issue 12: Placeholder |
DROID 5.0: Proposals Catalogue
The DROID 5.0 proposals catalogue can be found here. You can still contribute to the proposals, but because we are in the middle of our development process we can no longer guarantee that new proposals will be taken into account. However, we are still very interested in any opinions or feedback you may have.
What happens next
- We are now developing the DROID 5 code and new signatures, a process that will run until June 2010. The main items scheduled for development are:
  - Identification of container formats, e.g. Zip-based (Office 2007, OpenDocument) and OLE2-based (most Microsoft binary formats).
  - Identification of files inside archive files (zip, gzip, tar, and possibly others, time permitting).
  - Identification of textual formats and encodings, and of structured textual formats, using a heuristic approach.
  - Better result reporting (separating statements of fact from subjective judgements on quality) and more accurate signatures.
  - Interactive results tree: browse the results directly from the GUI.
  - Enhanced filtering of results: filter on more than one metadata field, and have different filters on different profiles.
  - Modular code, separating concerns with interfaces and using Maven. The GUI and command line are entirely separate from the core.
  - Greater usability in the GUI, unifying "standard" and "profiling" modes, with better user feedback.
  - Standardised results outputs in various CSV and XML formats.
- You can still contribute to the wiki, and we are very interested in any feedback you have. However, we cannot now guarantee we will take account of your contributions for this next round of development.
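
To illustrate what the container-format item in the list above involves, here is a minimal Python sketch. It is not DROID's implementation (DROID is driven by PRONOM signature files), and the function name `sniff_container` and the labels it returns are hypothetical. The sketch assumes only widely documented facts: ZIP local file headers start with the bytes `PK\x03\x04`, OLE2 compound files start with `D0 CF 11 E0 A1 B1 1A E1`, Office 2007 packages contain a `[Content_Types].xml` entry, and OpenDocument packages contain a `mimetype` entry.

```python
import io
import zipfile

# Well-known leading byte sequences ("magic numbers") for the two container families.
ZIP_MAGIC = b"PK\x03\x04"
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def sniff_container(data: bytes) -> str:
    """Return a coarse, illustrative label for the given file content.

    Hypothetical helper for illustration only; the labels are made up
    and are not PRONOM identifiers.
    """
    if data.startswith(OLE2_MAGIC):
        # Telling Word from Excel etc. would require reading the streams
        # inside the compound file, which this sketch does not attempt.
        return "OLE2 compound document"

    if data.startswith(ZIP_MAGIC):
        try:
            with zipfile.ZipFile(io.BytesIO(data)) as archive:
                names = archive.namelist()
                if "[Content_Types].xml" in names:
                    return "ZIP: Office Open XML package"
                if "mimetype" in names:
                    # OpenDocument packages store their media type in this entry.
                    mime = archive.read("mimetype").decode("ascii", "replace").strip()
                    return f"ZIP: {mime}"
        except zipfile.BadZipFile:
            return "ZIP signature, unreadable archive"
        return "ZIP archive (generic)"

    return "unknown"
```

The sketch shows why a leading-bytes signature alone is not enough for container formats: every Zip-based document starts with the same bytes, so the identifier has to look at the entries inside the container to narrow the result.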