Can't reproduce the performance of Instruction following #166

Open
NaOH678 opened this issue Apr 28, 2025 · 0 comments

NaOH678 commented Apr 28, 2025

Hello! I followed #50 but can't reproduce the 1k-data result reported in the paper; I only got a win rate of 0.4. The hyperparameters and code are as follows:

CUDA_VISIBLE_DEVICES=1,3 accelerate launch --num_processes=2 multi_train/train.py \
    --output_dir 'multi_train/trainer_out_put/direft_paper_hparam_helpful_1kdata_loreft' \
    --max_samples 1000 \
    --model_name_or_path ../Llama-2-7b-hf/ \
    --per_device_train_batch_size 2 \
    --subspace_rank 4 \
    --warmup_ratio 0 \
    --weight_decay 0 \
    --learning_rate 9e-4 \
    --gradient_accumulation_steps 16 \
    --subtask helpful \
    --num_train_epochs 9 \
    --position f5+l5 \
    --dropout 0.05 \
    --target_layers 3 9 18 24 \
    --model_max_length 768 \
    --save_strategy "epoch"
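
For reference, the effective optimizer-step batch size implied by these flags can be sanity-checked with the quick calculation below (a minimal sketch; it assumes both GPUs listed in CUDA_VISIBLE_DEVICES are actually used, i.e. --num_processes=2, and the variable names simply mirror the flags above):

# Quick check of the effective global batch size implied by the launch command.
# Values are copied from the flags above; num_processes mirrors --num_processes=2.
per_device_train_batch_size = 2
gradient_accumulation_steps = 16
num_processes = 2

effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_processes
)
print(effective_batch_size)  # 2 * 16 * 2 = 64 examples per optimizer step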

# NOTE: the issue does not show the original import block; the imports below are a
# reasonable reconstruction. ReftArguments, TrainingArguments and DataArguments are
# the author's custom argument dataclasses, and ReftTrainerForCausalLMDistributed is
# assumed to come from the author's training setup; neither import is shown here.
import torch
import torch.distributed as dist
import transformers
import wandb
from accelerate import Accelerator
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, HfArgumentParser, set_seed
from pyreft import (
    LoreftIntervention,
    ReftConfig,
    ReftDataCollator,
    ReftSupervisedDataset,
    get_reft_model,
)


def main():
    accelerator = Accelerator()
    rank = accelerator.process_index

    parser = HfArgumentParser(
        (ReftArguments, TrainingArguments, DataArguments)
    )
    (
        reftargs,
        training_args,
        data_args,
    ) = parser.parse_args_into_dataclasses()

    subspace_rank = reftargs.subspace_rank

    
    set_seed(training_args.seed)
    model_name_or_path = training_args.model_name_or_path # yahma/llama-7b-hf or yahma/llama-13b-hf
    model = AutoModelForCausalLM.from_pretrained(
        model_name_or_path, 
        torch_dtype=torch.bfloat16
    )

    # get tokenizer
    model_max_length = training_args.model_max_length
    tokenizer = AutoTokenizer.from_pretrained(
        model_name_or_path, model_max_length=model_max_length, 
        padding_side="right", use_fast=False)
    tokenizer.pad_token = tokenizer.unk_token


    # load data
    if data_args.max_samples:
        max_samples = data_args.max_samples
    else:
        # can't be used for now!!!
        percentage = data_args.percentage

    subtask = reftargs.subtask

    # only the 'helpful' branch is shown in the issue
    if subtask == 'helpful':
        data = load_dataset('json', data_files='/dataset/ultra_feedback.json')['train']
        # helpful_data = helpful_data.select(range(min(max_samples, len(helpful_data))))

    # print(type(reftargs.target_layers))
    # print(reftargs.target_layers)
    if reftargs.target_layers == [-1]:
        TARGET_LAYERS = list(range(len(model.model.layers)))
    else:
        TARGET_LAYERS = reftargs.target_layers

    # get reft model
    reft_config = ReftConfig(representations=[
        {
            "layer": layer, "component": "block_output",
            "intervention": LoreftIntervention(
                embed_dim=model.config.hidden_size,
                low_rank_dimension=reftargs.subspace_rank,
                dropout=training_args.dropout,
                add_bias=False)
        }
        for layer in TARGET_LAYERS
    ])
    
    reft_model = get_reft_model(model, reft_config)
    reft_model.print_trainable_parameters()


    train_dataset = ReftSupervisedDataset(
        "Subloreft", None, tokenizer, dataset=data,
        # position "f5+l5" intervenes on the first 5 and last 5 prompt tokens;
        # share_weights=True ties the prefix and suffix interventions together
        **{"num_interventions": len(reft_model.interventions), "position": reftargs.position, "share_weights": True},
        input_field="input", instruction_field="instruction", output_field="output",
        seed=training_args.seed, max_n_example=data_args.max_samples,
        no_stop=False
    )
    print(train_dataset[0])


    data_collator_fn = transformers.DataCollatorForSeq2Seq(
        tokenizer=tokenizer,
        model=model,
        label_pad_token_id=-100,
        padding="longest"
    )
    data_collator = ReftDataCollator(data_collator=data_collator_fn)

    if rank == 0:
        # os.environ["WANDB_MODE"] = "offline"  
        wandb.init(project=f"Reft_{reftargs.subtask}", name=f"first_train_{reftargs.subtask}")
        print(torch.cuda.device_count())
        wandb.log(dict(
            num_gpus=torch.cuda.device_count(),
            num_train_epochs=training_args.num_train_epochs, 
            learning_rate=training_args.learning_rate, 
            per_device_train_batch_size=training_args.per_device_train_batch_size, 
            gradient_accumulation_steps=training_args.gradient_accumulation_steps,
            warmup_ratio=training_args.warmup_ratio,
            weight_decay=training_args.weight_decay,
            positions=reftargs.position,
            dropout=training_args.dropout,
            subspace_rank=reftargs.subspace_rank,
            target_layers=reftargs.target_layers,
            seed=training_args.seed
        ))     

    training_args = transformers.TrainingArguments(
        num_train_epochs=training_args.num_train_epochs, 
        output_dir=training_args.output_dir, 
        learning_rate=training_args.learning_rate, 
        report_to='wandb',
        per_device_train_batch_size=training_args.per_device_train_batch_size, 
        logging_steps=1,
        ddp_find_unused_parameters=False, 
        gradient_accumulation_steps=training_args.gradient_accumulation_steps,
        warmup_ratio=training_args.warmup_ratio,
        save_total_limit=10,
        save_strategy=training_args.save_strategy,
        weight_decay=training_args.weight_decay,
        seed=training_args.seed
    )

    
    trainer = ReftTrainerForCausalLMDistributed(
        model=reft_model, 
        tokenizer=tokenizer, 
        args=training_args, 
        train_dataset=train_dataset, 
        eval_dataset=None, 
        data_collator=data_collator
    )
    if dist.is_initialized():
        dist.barrier()
    
    trainer.train(resume_from_checkpoint=False)
    trainer.save_state()


if __name__ == "__main__":
    main()
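
For context on the 0.4 number quoted above: the instruction-following win rate is typically computed by having a judge compare the tuned model's responses against a reference model's responses and counting how often the tuned model wins. The snippet below is only a minimal sketch of that bookkeeping, assuming a hypothetical list of per-example verdicts; it is not the evaluation pipeline used to produce the paper's numbers.

# Minimal sketch of win-rate bookkeeping over per-example judge verdicts.
# `verdicts` is hypothetical: one of "win", "tie", or "loss" per evaluated prompt;
# ties are counted as half a win, a common convention.
def win_rate(verdicts):
    wins = sum(v == "win" for v in verdicts)
    ties = sum(v == "tie" for v in verdicts)
    return (wins + 0.5 * ties) / len(verdicts)

print(win_rate(["win", "loss", "tie", "win"]))  # 0.625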

