Prompt-related notes
Prompt types
This prompt falls within the category "Advice in relationships" and has 5 constraints (must be appropriate for a 64-year-old from China, not mahjong-related, not a "top 10" gift, bulleted format, $200 budget).
- Category?
- Response evaluation
  - Prompt following
  - Accuracy - around a 4-5 score if accurate (the degree to which it helps)
  - Fluency
  - Localization
  - Formatting
- If the responses are similar - pick the middle, or lean slightly left/right
- Justification
- Comprehension: drop the task if roughly 30% is in error (NO)
- Accuracy: 1, 2, 3 - give a 3 if it is completely off
- Completeness - how complete the answer is
- Fluency - other languages, grammar, spelling, sentence structure
- Format - the type of output, e.g. list, prose explanation
- Justification?
Justification is a concise rationale that gives insight into why we provided our preferences in rating and ranking the responses. Justifications share key differentiators between the responses, highlight these factors with reasonably detailed evidence, and constructively critique the responses in potential improvement areas.
Justifications allow reviewers, the project team, and the client to better understand your thinking and preferences so that we can provide accurate feedback, useful and targeted training, and improve model alignment with human preferences (e.g., values, satisfaction).
Justifications are often read and evaluated during reviews and audits, even before the prompt, response, or ratings. Clear yet specific justifications go a long way. The best justifications provide reviewers and auditors insights into the responses without having to read the prompts or responses.
- Poor justification
  - Omitting the main subject or topic
  - Lacking evidence or supporting claims from the responses or research
  - If the responses are similar, say one is slightly better
  - Answers with superficial rationale
  - Unnecessarily verbose, extraneous, rambling, or employing flowery language
- Differentiating factors
- Good justification
  - Preference
  - Explain specifically what is missing
  - Touch on what would make it better
The Potential Intentions of the Mention (PIMs) provided in the task may not accurately capture the Mention's true intent.
ML training tips
Remember you just need to find MINOR issues in one of the responses!
After doing a bunch of tasks myself, I've learned that niche details (e.g. birth cities of Indian CEOs - this strategy is really good for Open/Closed QA!), complex instructions (e.g. italicize any Māori words and include their translation and first year or century of usage after the italicized word in the same parentheses; if there is no year/century stated in the text, use N/A), and riddles (for Open QA) are great ways to trick the model!
Most contributors first find a reference text and then try to create prompts around it, which is inefficient. Instead, start with topics and strategies in mind, then select a suitable reference text.
For example, I receive an "en-New Zealand" task. My immediate thought is "What are some lesser-known details about New Zealand that will make the model fail?" and my mind goes to Māori language or customs because I didn't learn a lot about it in grade school. Other options could include details of local New Zealand cricketers or New Zealand flora and fauna - you know best for your language or country.
- Complex prompts
"Summarize the below text. Italicize any Māori words and include their translation, [comma] first year/century of usage in parentheses after the word. If no year/century is stated, use N/A."
"find and italicize the Maori words and include translations in parentheses" (괄호)
I quickly pivoted and made the constraint more complex.
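A constraint like this is also easy to check objectively. Below is a minimal Python sketch of such a check, assuming the response marks italics with Markdown asterisks and puts the translation and year/century in a trailing parenthetical; the helper name and regexes are my own illustration, not part of any project tooling.

```python
import re

# Hypothetical helper (illustration only): flags italicized words that are
# missing the required "(translation, first year/century or N/A)" parenthetical.
# Assumes Markdown-style *italics* in the response.
def check_maori_constraint(response: str) -> list[str]:
    missing = []
    for match in re.finditer(r"\*([^*]+)\*", response):
        tail = response[match.end():]
        # Expect something like " (family, N/A)" immediately after the word.
        if not re.match(r"\s*\([^,()]+,\s*[^()]+\)", tail):
            missing.append(match.group(1))
    return missing

# The second italicized word has no parenthetical, so it gets flagged.
text = "The *whānau* (family, N/A) gathered and a *haka* was performed."
print(check_maori_constraint(text))  # -> ['haka']
```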
- Keep evaluations brief
Keep it to 1-2 short sentences. For example, "There are Truthfulness issues because the number of goals statistic is incorrect for Messi - it's 15, not 12" or "There are Instruction Following issues because the response does not have bullet points".
- Things to think about
Look at it objectively; keep the perspective of whether what the prompt asks for has been satisfied.
Once the minor issues exceed three, treat them as a major issue.
Errors
- Missing truthfulness errors: The first big bucket of errors is missing hallucinations (made-up statements).
- Missing instruction-following errors: The next biggest cause of task defects revolves around contributors missing a constraint.
- Confusing dimensions with one another: Two fields sometimes appear to overlap, leading to confusion and flipped ratings.
  - Confusing instruction following and truthfulness
  - Confusing instruction following and writing quality
  - Confusing writing quality and localization
  - Confusing verbosity and writing quality
  - Confusing verbosity and instruction following
- Knowing when external data is acceptable vs. not
- Identifying and adding good reference texts: People are confused about where to put the reference text and what constitutes a good reference text.
- Detecting pleasantries: What is a pleasantry? What type of error is it?
- Misunderstanding prompt category definitions: People are confused about the prompt category definitions.
- Types
  - Q/A
  - Summarization
  - Rewrite
  - Extraction
- Avoiding instruction-following errors
Remove all subjectivity in assessing whether the constraint is followed by making your constraint easily verifiable (a quick sketch is included at the end of these notes).
Common misses: character limits not respected, format errors, structural errors, constraints not followed.
Constraints - write them down!
  - Key points
  - Detailed conditions
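As a companion to the "easily verifiable" point above, here is a minimal sketch of turning two of the constraint types listed there (bulleted format, character limit) into objective checks. The 300-character limit, the "- " bullet marker, and the function name are assumptions for illustration, not project requirements.

```python
# Hypothetical sketch: objective checks for a bulleted-format constraint and a
# length limit. Limit value and bullet marker are illustrative assumptions.
def check_bullets_and_length(response: str, max_chars: int = 300) -> list[str]:
    issues = []
    lines = [line for line in response.splitlines() if line.strip()]
    if not all(line.lstrip().startswith("- ") for line in lines):
        issues.append("not every line is a bullet point")
    if len(response) > max_chars:
        issues.append(f"response exceeds {max_chars} characters")
    return issues

print(check_bullets_and_length("- A tea set\n- A calligraphy brush set"))  # -> []
print(check_bullets_and_length("A tea set would be a nice gift."))
# -> ['not every line is a bullet point']
```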