Extending the library

To extend the library with new formats (either input or output) a developer only needs to subclass the specified class (for reading, hepdata_converter.parsers.Parser, for writing, hepdata_converter.writers.Writer), and make sure that files containing these implementations are respectively in hepdata_converter.parsers or hepdata_converter.writers.

Creating a new Parser

In order to create a new Parser you need to create a class inheriting the Parser class and override the def parse(self, data_in, *args, **kwargs): abstract method. If you’re trying to extend the library you should put the file containing the new Parser in the hepdata_converter/parsers directory. The name of the class is important: the new parser will be available by this (case-insensitive) name. If your goal is a simple hack, then the package containing the new parser class can be wherever, but the parser class has to be imported before using the hepdata_converter.convert function.

An example is given below:

from hepdata_converter.common import Option
from hepdata_converter.parsers import Parser, ParsedData


class FOO(Parser):
    help = 'FOO Parser help text displayed in CLI after typing hepdata-converter --help'

    @classmethod
    def options(cls):
        options = Parser.options()
        # add foo_option which is bool and has default value of True
        # it will be automatically added as named argument to __init__ function
        # as foo_option (code below will work):
        # foo = FOO(foo_option=False)
        #
        # additionally it will be accessible inside the class instance as
        # self.foo_option

        options['foo_option'] = Option('foo-option', default=True, type=bool, required=False,
                                       help='Description of the option printed in CLI')

    def parse(self, data_in, *args, **kwargs):
        # WARNING it is developers responsibility to be able to handle
        # data_in regardless whether it is string (path) or filelike
        # object

        # list of hepdata_converter.Table objects
        tables = []
        # dictionary corresponding to submission.yaml general element (comment, license - not table data)
        metadata = {}

        # ... parse data_in into metadata and tables

        return ParsedData(metadata, tables)

If this class is put in (e.g.) hepdata_converter/parsers/foo_parser.py then it could be accessed from Python code as:

import hepdata_converter

hepdata_converter.convert('/path/to/input', '/path/to/output',
                          options={'input_format': 'foo'})

It could also be accessed from the CLI:

$ hepdata-converter --input-format foo /path/to/input /path/to/output

WARNING: it is the developer’s responsibility to be able to handle data_in in def parse(self, data_in, *args, **kwargs): regardless whether it is a string (path) or a file-like object.

Creating a new Writer

Creation of a new Writer is similar to creating a new Parser (see above), but for the sake of completeness the full description is provided below. In order to create a new Writer you need to create a class inheriting the Writer class and override the def write(self, data_in, data_out, *args, **kwargs): abstract method. If you’re trying to extend the library you should put the file containing the new Parser in the hepdata_converter/writers directory. The name of the class is important: the new writer will be available by this (case-insensitive) name. If your goal is a simple hack, then the package containing the new writer class can be wherever, but the writer class has to be imported before using the hepdata_converter.convert function.

An example is given below:

from hepdata_converter.common import Option
from hepdata_converter.writers import Writer


class FOO(Writer):
    help = 'FOO Writer help text displayed in CLI after typing hepdata-converter --help'

    @classmethod
    def options(cls):
        options = Writer.options()
        # add foo_option which is bool and has default value of True
        # it will be automatically added as named argument to __init__ function
        # as foo_option (code below will work):
        # foo = FOO(foo_option=False)
        #
        # additionally it will be accessible inside the class instance as
        # self.foo_option

        options['foo_option'] = Option('foo-option', default=True, type=bool, required=False,
                                       help='Description of the option printed in CLI')

    def write(self, data_in, data_out, *args, **kwargs):
        # data_in is directly passed from Parser.parse method
        # and is instance of ParsedData

        # WARNING it is developers responsibility to be able to handle
        # data_out regardless whether it is string (path) or filelike
        # object

        pass

If this class is put in (e.g.) hepdata_converter/writers/foo_writer.py then it could be accessed from Python code as:

import hepdata_converter

hepdata_converter.convert('/path/to/input', '/path/to/output',
                          options={'output_format': 'foo'})

It could also be accessed from the CLI:

hepdata-converter --output-format foo /path/to/input /path/to/output

WARNING: it is the developer’s responsibility to be able to handle data_out in def write(self, data_in, data_out, *args, **kwargs): regardless whether it is a string (path) or a file-like object.