* fix commands in readme, using new arg format
* fix typo
* add required -i flag to chat_eval example runs