This repository has been archived by the owner on Sep 18, 2024. It is now read-only.

nni PAI training service user experience issues #2567

Closed
1 of 5 tasks
chicm-ms opened this issue Jun 17, 2020 · 0 comments

chicm-ms commented Jun 17, 2020

  • Unstable mount: files go missing on storage (low-priority cluster)

  • NNI error: mounted files occasionally cannot be found (179)

  • Trial jobs have long waiting times

  • Documentation for Azure storage support (no Azure storage support on 179)

  • Metrics problem:

    [6/17/2020, 9:33:12 AM] ERROR [ 'SyntaxError: Unexpected token f in JSON at position 92
        at JSON.parse ()
        at NNIDataStore.storeMetricData (/home/quzha/.local/nni/core/nniDataStore.js:106:30)
        at NNIManager.onTrialJobMetrics (/home/quzha/.local/nni/core/nnimanager.js:518:30)
        at EventEmitter.NNIManager.trialJobMetricListener (/home/quzha/.local/nni/core/nnimanager.js:33:18)
        at EventEmitter.emit (events.js:182:13)
        at PAIJobRestServer.handleTrialMetrics (/home/quzha/.local/nni/training_service/pai/paiJobRestServer.js:12:52)
        at router.post (/home/quzha/.local/nni/training_service/common/clusterJobRestServer.js:115:30)
        at Layer.handle [as handle_request] (/home/quzha/.local/nni/node_modules/express/lib/router/layer.js:95:5)
        at next (/home/quzha/.local/nni/node_modules/express/lib/router/route.js:137:13)
        at Route.dispatch (/home/quzha/.local/nni/node_modules/express/lib/router/route.js:112:3)' ]
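
For anyone trying to narrow down the metrics error: the trace shows the metric payload failing JSON parsing at a specific character offset, so it can help to check the raw string on the trial side before it is sent. The snippet below is only a rough sketch of that check, not NNI code. The function name `parse_metric_payload` and the sample payloads (including the `type` and `value` fields and the stray `f`) are made up for illustration, and the actual payload that triggered the error above is not known.

```python
import json
from typing import Any, Optional, Tuple


def parse_metric_payload(raw: str) -> Tuple[Optional[Any], Optional[str]]:
    """Hypothetical helper: return (parsed, None) on success or (None, error) on failure."""
    try:
        return json.loads(raw), None
    except json.JSONDecodeError as exc:
        # exc.pos is the character offset at which parsing failed;
        # show a little surrounding context to make the bad token visible.
        start = max(exc.pos - 10, 0)
        snippet = raw[start:exc.pos + 10]
        return None, f"invalid JSON at position {exc.pos}: {exc.msg} near {snippet!r}"


# Illustrative payloads only (assumed shape, not taken from the issue).
good, err = parse_metric_payload('{"type": "PERIODICAL", "value": 0.93}')
bad, bad_err = parse_metric_payload('{"type": "PERIODICAL", "value": f0.93}')

print(good, err)
print(bad, bad_err)
```

Logging the offending substring this way would at least show which token sits at the reported position, which the original error message does not.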
