annotate data/fileio/TextTest.h @ 1855:db489a1ece9b

Pull out text-document check; it's useful elsewhere
author Chris Cannam
date Mon, 11 May 2020 17:27:18 +0100
/* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */

/*
    Sonic Visualiser
    An audio file viewer and annotation editor.
    Centre for Digital Music, Queen Mary, University of London.

    This program is free software; you can redistribute it and/or
    modify it under the terms of the GNU General Public License as
    published by the Free Software Foundation; either version 2 of the
    License, or (at your option) any later version.  See the file
    COPYING included with this distribution for more information.
*/

#ifndef SV_TEXT_TEST_H
#define SV_TEXT_TEST_H

#include "data/fileio/FileSource.h"

class TextTest
{
public:
    /**
     * Return true if the source appears to point to a text format of
     * some kind (could be CSV, XML, RDF/Turtle etc).
     *
     * We apply two tests and report success if either succeeds:
     *
     * 1. The first few hundred bytes (where present) of the document
     * are valid UTF-8
     *
     * 2. The document starts with the text "<?xml" when opened using
     * QXmlInputSource (which guesses its text encoding)
     *
     * So we only accept non-UTF-8 encodings where they also happen to
     * be XML documents.
     */
    static bool isApparentTextDocument(FileSource);
};

#endif
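
The two tests described in the comment on isApparentTextDocument can be illustrated with a small standalone sketch. This is not the Sonic Visualiser implementation, which is not shown here: it works on an in-memory std::string rather than a FileSource, and it replaces the QXmlInputSource check with a plain byte comparison against "<?xml", so it does no encoding guessing. The names looksLikeUtf8, startsWithXmlDeclaration and isApparentText, and the 200-byte window, are hypothetical choices made for illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Sonic Visualiser): true if the first
// maxBytes bytes of data are valid UTF-8. A multi-byte sequence cut off by
// the end of the window is tolerated only if the data continues past it.
bool looksLikeUtf8(const std::string &data, std::size_t maxBytes = 200)
{
    const std::size_t n = data.size() < maxBytes ? data.size() : maxBytes;
    const bool truncated = n < data.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char c = static_cast<unsigned char>(data[i]);
        if (c < 0x80) { ++i; continue; } // plain ASCII byte
        std::size_t extra = 0;           // continuation bytes expected
        unsigned long cp = 0;            // code point being decoded
        unsigned long minValue = 0;      // smallest non-overlong value
        if ((c & 0xE0) == 0xC0)      { extra = 1; cp = c & 0x1F; minValue = 0x80; }
        else if ((c & 0xF0) == 0xE0) { extra = 2; cp = c & 0x0F; minValue = 0x800; }
        else if ((c & 0xF8) == 0xF0) { extra = 3; cp = c & 0x07; minValue = 0x10000; }
        else return false;               // stray continuation or invalid lead byte
        if (i + extra >= n) return truncated; // sequence runs off the window
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char cc = static_cast<unsigned char>(data[i + k]);
            if ((cc & 0xC0) != 0x80) return false; // not a continuation byte
            cp = (cp << 6) | (cc & 0x3F);
        }
        if (cp < minValue) return false;                 // overlong encoding
        if (cp > 0x10FFFF) return false;                 // beyond Unicode range
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // surrogate code point
        i += extra + 1;
    }
    return true;
}

// Hypothetical stand-in for the QXmlInputSource test: a byte-level prefix
// check only, so it will miss XML stored in e.g. UTF-16.
bool startsWithXmlDeclaration(const std::string &data)
{
    return data.compare(0, 5, "<?xml") == 0;
}

// Report success if either test succeeds, as the comment describes.
bool isApparentText(const std::string &data)
{
    return looksLikeUtf8(data) || startsWithXmlDeclaration(data);
}
```

Under these assumptions, a Latin-1 encoded XML file fails the UTF-8 test but is still accepted because of its "<?xml" prefix, which mirrors the comment's remark that non-UTF-8 encodings are accepted only when the document is also XML.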