parser = BiaffineTransformerDependencyParserTF()
parser.fit(PTB_SD330_TRAIN, PTB_SD330_DEV, save_dir,
           'albert-xxlarge-v2',
           batch_size=256,
           warmup_steps_ratio=.1,
           token_mapping=PTB_TOKEN_MAPPING,
           samples_per_batch=150,
           transformer_dropout=.33,
           learning_rate=2e-3,
           learning_rate_transformer=1e-5,
           # early_stopping_patience=10,
           )
The following error occurred when training with this model. Mr. He, is this caused by my configuration, or by something else?