feature summary

core4 facilitates day-to-day tasks of data engineers and data scientists. The main objective of data engineers is to effectively integrate and deliver data for analysis and insights. Meanwhile the goal of data scientists is to provide relevant insights to business users.

core4s target is to support this community with technical bits and pieces in order to let them focus on their main objectives: data integration and insight creation.

The following chart outlines main components of the core4 framework. It consists of three main compartments:

  • the core4 backend (blue)
  • the core4 frontend (green)
  • general purpose components (grey)
core4 component overview

With the bundling of the following features these components reduce time to market for data management, insight extraction and its business application. As a consequence results to the end user are delivered within hours or days - not weeks or months.

  • automated job execution - simple schema to automate, monitor and control jobs in a reliable, yet fault tolerant way
  • scale up (and scale down) - parallel computing mechanic from multi-core laptop environments to a multi-node cluster setup
  • project architecture - a unified scheme to package and manage multiple projects, accounts, business domains, or tenants
  • web service API - unified interface to access and distribute insights and data to exchange data with other systems inbound and outbount including web application frameworks
  • mini applications - extension to the web service API to deliver simple applications based on HTML5 and Jinja2
  • unified authorisation and access management across jobs, web service API, databases and applications
  • configuration management - a flexible configuration system which allows an effective transition from analysis and development, through testing and staging into production environments; furthermore core4 configuration mechanics pierce the boundaries between environments and provide easy, yet secure access to the most relevant ingredient of data engineers’ and data scientists’ daily business: access real (production) data
  • central logging - of all core4 activities, actions and events; this facilitates troubleshooting, bug fixing and “in performance” analyses in a distributed runtime environment
  • simple deployment and rollout mechanic to support the not so technical data scientist and data engineer in productionising all deliverables, results and tools