12 rules of data-oriented programming

De
Aller à la navigation Aller à la recherche

Maintainability

  1. Minimize software dependencies and avoid heavy or unusual frameworks
  2. Never use hard-coded values: use command-line parameters, environment variables and configuration files instead
  3. Propose a verbose/debug mode (logging read and written files, calculation steps)
  4. Create unit tests through development
  5. Use versioning (git)

Usability

  1. Provide documentation for the software: a README file and an usage message; if applicable, also provide an example dataset
  2. Specify installation instructions with an explicit and exhaustive list of requirements
  3. make another person in the team install and test the software
  4. Only use standard file formats: CSV/TSV, XML, JSON, YAML
  5. For web application, provide an API

Efficiency

  1. Use memory parsimoniously (if compiled, monitor with a tool like Valgrind)
  2. Add built-in support for compressed data (gzipped)

Recommended software

  • Data: SQLite, PostgreSQL, Neo4j, HDF5
  • Languages: C, Python, R, bash, Tcl, PHP