Move core job management code to the controller in Go #462

@typhoonzero

Description

Following previous discussions, we'd like to move job management, autoscaling, and the job priority scheduler into the "controller", and port the Python implementation to Go.

This change makes the website slimmer and makes user management pluggable. After this change we should have the following modules:

  • Static web pages for general information.
  • Simple account management.
  • paddlectl simply caches k8s keys and storage service keys (e.g. for Ceph) and talks directly to the k8s api-server using the "TrainingJob" resource.
  • The controller parses the "TrainingJob" resource into a paddle job, including master, pserver, and trainer.
  • The autoscaler runs in the background and scales the current jobs in the cluster.
  • The scheduler determines how much each job can consume, using some GPU priority algorithm.
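
To make the controller and scheduler roles above more concrete, here is a rough Go sketch. Everything in it is an assumption for illustration only: the `TrainingJobSpec` fields, the `Component` type, and the `Expand` and `Allocate` functions are hypothetical names, not the actual "TrainingJob" schema or the planned implementation. The allocation rule (minimum trainers first, then leftovers by priority) is just one simple stand-in for the "GPU priority algorithm", which is not specified here.

```go
package main

import (
	"fmt"
	"sort"
)

// TrainingJobSpec is a hypothetical, simplified stand-in for the
// "TrainingJob" resource; the real schema is not defined in this issue.
type TrainingJobSpec struct {
	Name        string
	PServers    int
	MinTrainers int
	MaxTrainers int
	Priority    int // assumption: larger value means higher priority
}

// Component is one workload the controller would create for a job.
type Component struct {
	Name     string
	Role     string
	Replicas int
}

// Expand turns a job spec into its master, pserver, and trainer parts.
// Trainers start at MinTrainers; scaling up is left to the autoscaler.
func Expand(spec TrainingJobSpec) ([]Component, error) {
	if spec.Name == "" {
		return nil, fmt.Errorf("training job has no name")
	}
	if spec.PServers < 0 || spec.MinTrainers < 1 || spec.MaxTrainers < spec.MinTrainers {
		return nil, fmt.Errorf("training job %q: invalid replica counts", spec.Name)
	}
	return []Component{
		{Name: spec.Name + "-master", Role: "master", Replicas: 1},
		{Name: spec.Name + "-pserver", Role: "pserver", Replicas: spec.PServers},
		{Name: spec.Name + "-trainer", Role: "trainer", Replicas: spec.MinTrainers},
	}, nil
}

// Allocate distributes a fixed number of trainer slots (for example,
// one GPU per trainer). Higher-priority jobs are served first: each job
// that fits gets MinTrainers, then leftovers are handed out up to
// MaxTrainers. Jobs that do not fit are absent from the result.
func Allocate(jobs []TrainingJobSpec, slots int) map[string]int {
	sorted := append([]TrainingJobSpec(nil), jobs...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].Priority > sorted[j].Priority
	})

	out := make(map[string]int, len(sorted))
	for _, j := range sorted {
		if slots >= j.MinTrainers {
			out[j.Name] = j.MinTrainers
			slots -= j.MinTrainers
		}
	}
	for _, j := range sorted {
		n, ok := out[j.Name]
		if !ok {
			continue
		}
		extra := j.MaxTrainers - n
		if extra > slots {
			extra = slots
		}
		out[j.Name] = n + extra
		slots -= extra
	}
	return out
}
```

In this sketch, the controller would call something like `Expand` when it sees a new "TrainingJob", while the autoscaler would periodically call something like `Allocate` and resize each job's trainer count to match. For example, with five slots, a job `a` (priority 2, 1 to 4 trainers) and a job `b` (priority 1, 2 to 8 trainers), `Allocate` gives `a` three trainers and `b` two.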
