dagsby
Gatsby library for orchestrating data pipelines across workers
Last updated 4 years ago by kylemathews.
License: MIT
$ npm install dagsby 

Dagsby

An experimental library for orchestrating jobs and data processing across processes and machines within Gatsby.

Components

  • task/job: defines inputs/outputs, dependencies, and the code to execute
  • worker pool: a pool of Node.js processes that can execute jobs
  • runner: uses N worker pools to run tasks

Getting started

npm i dagsby

Start a worker pool:

node node_modules/dagsby/dist/worker-pool-server.js --numWorkers 4 --socketPort 9999 --httpPort 10020
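If you would rather start the pool from a Node script than from a shell, the same command can be launched with `child_process.spawn`. This is a minimal sketch: `workerPoolArgs` and `startWorkerPool` are hypothetical helpers, not part of dagsby, and the only flags used are the ones shown in the command above.

```javascript
const { spawn } = require(`child_process`)

// Hypothetical helper (not part of dagsby): builds the argument list
// for the worker pool command shown above.
function workerPoolArgs({ numWorkers, socketPort, httpPort }) {
  return [
    `node_modules/dagsby/dist/worker-pool-server.js`,
    `--numWorkers`,
    String(numWorkers),
    `--socketPort`,
    String(socketPort),
    `--httpPort`,
    String(httpPort),
  ]
}

// Hypothetical helper: starts the worker pool as a child process and
// forwards its output to the current terminal.
function startWorkerPool(options) {
  return spawn(process.execPath, workerPoolArgs(options), { stdio: `inherit` })
}

// Usage (not run here):
// startWorkerPool({ numWorkers: 4, socketPort: 9999, httpPort: 10020 })
```

The ports passed here need to match the `socketPort` and `httpPort` given to the runner.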

Create a simple task in a test.js file and run it on the worker pool.

const dagsby = require(`dagsby`)

;(async () => {
  // Create our runner.
  const runner = await dagsby.createRunner({
    pools: [{ socketPort: 9999, httpPort: 10020 }],
  })

  // Create a simple task
  const task = await dagsby.createTask({
    func: args => `Hello ${args.name}!`,
    // Written using Avro's schema language.
    argsSchema: [
      {
        name: `name`,
        type: `string`,
      },
    ],
  })

  // Set up the task on the worker pool(s).
  await runner.setupTask(task)

  // Run the task!
  const result = await runner.executeTask({ task, args: { name: `World` } })

  console.log(result)
})()
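The `argsSchema` in the example lists each argument as a `{ name, type }` field. To make that concrete, here is a rough sketch of what checking `args` against such a list could look like. `checkArgs` is a hypothetical helper that maps only a few primitive type names onto `typeof` checks; it is not how dagsby validates arguments internally.

```javascript
// Hypothetical helper, not part of dagsby: checks args against a list of
// { name, type } fields like the argsSchema above. Only a few primitive
// type names are mapped; anything else is reported as unsupported.
const TYPEOF_BY_SCHEMA_TYPE = new Map([
  [`string`, `string`],
  [`boolean`, `boolean`],
  [`double`, `number`],
])

function checkArgs(argsSchema, args) {
  const errors = []
  for (const field of argsSchema) {
    const expected = TYPEOF_BY_SCHEMA_TYPE.get(field.type)
    if (expected === undefined) {
      errors.push(`${field.name}: unsupported type "${field.type}"`)
    } else if (typeof args[field.name] !== expected) {
      errors.push(`${field.name}: expected ${field.type}`)
    }
  }
  return errors
}

const schema = [{ name: `name`, type: `string` }]
console.log(checkArgs(schema, { name: `World` })) // empty list, args match
console.log(checkArgs(schema, { name: 42 })) // one error for "name"
```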

Let's try a more complex task where we specify a required file and add an npm dependency.

First create a file called hello.txt with some text in it.

Then add this code to our test file after the first task, inside the same async function (it uses await).

const mySecondTask = await dagsby.createTask({
  func: (args, { files }) => {
    const fs = require(`fs`)
    const _ = require(`lodash`)
    const text = fs.readFileSync(files.text.localPath, `utf8`)
    const camelCase = _.camelCase(text)

    return `${args.preface} ${text} \n\n ${camelCase}`
  },
  argsSchema: [{ name: `preface`, type: `string` }],
  dependencies: {
    lodash: `latest`,
  },
  files: {
    text: {
      originPath: require(`path`).join(__dirname, `hello.txt`),
    },
  },
})
await runner.setupTask(mySecondTask)

const result2 = await runner.executeTask({
  task: mySecondTask,
  args: { preface: `yeeesss` },
})

console.log(result2)
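Note that the second task calls `require` inside `func` rather than at the top of the file. A likely reason (an assumption here, since this README does not describe how tasks reach the workers) is that the function is sent to another process as source text, so variables from the enclosing file do not travel with it. The sketch below imitates that with `toString()` and `new Function`; `rehydrate` is a hypothetical stand-in, not dagsby code.

```javascript
// Hypothetical stand-in for shipping a function to another process:
// turn it into source text, then rebuild it with no access to the
// scope it was originally defined in. Only `require` is passed along.
function rehydrate(fn) {
  return new Function(`require`, `return (${fn.toString()})`)(require)
}

function demo() {
  // This variable only exists inside demo().
  const prefaceFromOuterScope = `Hello`

  // Relies on a variable from the enclosing scope.
  const usesOuterScope = () => prefaceFromOuterScope + ` World`

  // Loads everything it needs inside its own body.
  const selfContained = () => {
    const path = require(`path`)
    return path.basename(`/tmp/hello.txt`)
  }

  const results = {}
  results.selfContained = rehydrate(selfContained)()
  try {
    results.usesOuterScope = rehydrate(usesOuterScope)()
  } catch (err) {
    // The rebuilt function can no longer see prefaceFromOuterScope.
    results.usesOuterScope = err.name
  }
  return results
}

console.log(demo())
```

If dagsby does work this way, anything a task needs should come from its `args`, its declared `files` and `dependencies`, or a `require` call inside `func`.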

TODOs

  • [ ] support (again) running multiple types of tasks in parallel.
  • [ ] support multiple pools in runners.

Current Tags

  • 0.0.4 (latest, 4 years ago)

4 Versions

  • 0.0.4 (4 years ago)
  • 0.0.3 (4 years ago)
  • 0.0.2 (4 years ago)
  • 0.0.1 (4 years ago)