hugginface 3

[AWS SageMaker & HuggingFace] The requested resource studio ... is not available in this region | Intern

When else will I get to use AWS this freely!!! Working at a company is the best!!! While using AWS features without a care, something I could never do as a student, you eventually run into: **Failed to start kernel** Failed to launch app [sagemaker-data-scien-ml-g5-2xlarge-788bb6348367982dd036e22a2f37]. ResourceLimitExceeded: The requested resource studio/KernelGateway-ml.g5.2xlarge is not available in this region (Context: RequestId: 8c72bfde-4a66-44b2-9fd8-e05e5af45114, TimeStamp: 1698803755.498822..

[AWS SageMaker / HuggingFace] Training an 8-bit model is not supported yet. | Intern

UnexpectedStatusException: Error for Training job huggingface-peft-2023-11-07-00-53-07-2023-11-07-02-17-27-231: Failed. Reason: AlgorithmError: ExecuteUserScriptError: ExitCode 1 ErrorMessage "raise ValueError( ValueError: The model you want to train is loaded in 8-bit precision. Training an 8-bit model is not supported yet." Command "/opt/conda/bin/python3.9 run_clm.py --dataset_path /opt/ml/in..

[AWS SageMaker / HuggingFace] NotImplementedError: Cannot copy out of meta tensor; no data! | Intern

An error occurred while running a model on AWS SageMaker. Reason: AlgorithmError: ExecuteUserScriptError: ExitCode 1 ErrorMessage "NotImplementedError: Cannot copy out of meta tensor; no data!" Command "/opt/conda/bin/python3.9 run_clm.py --dataset_path /opt/ml/input/data/training --epochs 3 --lr 0...
