[AWS SageMaker / HuggingFace] Training an 8-bit model is not supported yet.

우주수첩 2023. 11. 7. 11:58

UnexpectedStatusException: Error for Training job huggingface-peft-2023-11-07-00-53-07-2023-11-07-02-17-27-231: Failed. Reason: AlgorithmError: ExecuteUserScriptError:
ExitCode 1
ErrorMessage "raise ValueError(
 ValueError: The model you want to train is loaded in 8-bit precision. Training an 8-bit model is not supported yet."
Command "/opt/conda/bin/python3.9 run_clm.py --dataset_path /opt/ml/input/data/training --epochs 3 --lr 0.0002 --model_id bigscience/bloomz-7b1 --per_device_train_batch_size 1", exit code: 1

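An error!!

I was merrily following along with the example code provided by AWS when this error popped up.

It says training an 8-bit model is not supported.

But what I'm doing is exactly that: fine-tuning with the model quantized to int8?!?! ㅇㅅㅇ??

If all you want is to make the error go away, the fix is simple.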

from datasets import load_from_disk
from transformers import AutoModelForCausalLM, set_seed


def training_function(args):
    # set seed
    set_seed(args.seed)

    dataset = load_from_disk(args.dataset_path)
    # load model from the hub
    model = AutoModelForCausalLM.from_pretrained(
        args.model_id,
        use_cache=not args.gradient_checkpointing,  # the cache must be off for gradient checkpointing
        device_map="auto",
        load_in_8bit=False,  # changed to False to avoid the error
    )
    # ... (rest of the function omitted)

When declaring the model, change load_in_8bit to False.

def parse_arge():
    """Parse the arguments."""
    parser = argparse.ArgumentParser()
    # ... (rest of the function omitted)

def create_peft_config(model):
    from peft import (
        get_peft_model,
        LoraConfig,
        TaskType,
        # prepare_model_for_int8_training,
    )

    peft_config = LoraConfig(
        task_type=TaskType.CAUSAL_LM,
        inference_mode=False,
        r=8,
        lora_alpha=32,
        lora_dropout=0.05,
        target_modules=["query_key_value"],
    )

    # prepare int-8 model for training
    # model = prepare_model_for_int8_training(model)
    model = get_peft_model(model, peft_config)
    model.print_trainable_parameters()
    return model

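Then, in create_peft_config (the function right after the argument parsing), comment out every int8-related line: the prepare_model_for_int8_training import and the call to it.

With both changes in place, the error above no longer occurs.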
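Rather than editing the script by hand each time, both changes could hang off a single switch. Below is a minimal sketch of that idea, assuming a hypothetical --load_in_8bit hyperparameter that is not part of the AWS example script. The helper names str_to_bool, build_parser, model_load_kwargs, and needs_int8_preparation are also made up for illustration. SageMaker passes hyperparameters to run_clm.py as command-line strings (as in the Command line of the error log), so the flag is parsed from text.

```python
import argparse


def str_to_bool(value: str) -> bool:
    """Interpret common true/false spellings passed on the command line."""
    lowered = value.strip().lower()
    if lowered in {"true", "1", "yes"}:
        return True
    if lowered in {"false", "0", "no"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    parser.add_argument("--model_id", type=str, default="bigscience/bloomz-7b1")
    # Hypothetical flag: NOT in the original example script.
    parser.add_argument("--load_in_8bit", type=str_to_bool, default=False)
    return parser


def model_load_kwargs(args: argparse.Namespace) -> dict:
    """Keyword arguments to forward to AutoModelForCausalLM.from_pretrained."""
    return {"device_map": "auto", "load_in_8bit": args.load_in_8bit}


def needs_int8_preparation(args: argparse.Namespace) -> bool:
    """The int8 preparation step only makes sense for a model loaded in 8-bit."""
    return args.load_in_8bit


if __name__ == "__main__":
    args = build_parser().parse_args(["--load_in_8bit", "False"])
    print(model_load_kwargs(args), needs_int8_preparation(args))
```

In training_function the dictionary would be unpacked into the from_pretrained call, and in create_peft_config the prepare_model_for_int8_training call would sit behind an `if needs_int8_preparation(args):` guard, so the two settings can't drift apart.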

This means the model is now loaded and trained without any quantization,

and if you do that, well....

2023.11.07 - [🏝️ Awesome AWS] - [Wise Intern Life | AWS SageMaker / HuggingFace] NotImplementedError: Cannot copy out of meta tensor; no data!

you get this error instead ^^

Ptui.
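For a sense of why the example quantizes in the first place, here is a rough back-of-the-envelope sketch of how much memory the weights alone take at different precisions. The figure of about 7.1 billion parameters is an assumption read off the model name bigscience/bloomz-7b1, and weight_memory_gib is a made-up helper. The estimate ignores activations, gradients, and optimizer state, so real usage will be higher.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}

# Assumption: roughly 7.1 billion parameters, guessed from the "7b1" in the model name.
ASSUMED_PARAMS = 7.1e9


def weight_memory_gib(num_params: float, dtype: str) -> float:
    """Memory for the weights only, in GiB (1 GiB = 1024**3 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_memory_gib(ASSUMED_PARAMS, dtype):.1f} GiB")
```

Going from int8 to float32 multiplies the weight footprint by four, so turning quantization off is not a free fix on a small GPU instance.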
