Existing materials/corpora
In this LibGuide we refer to text mining corpora and databases:
Not all text databases and corpora are suitable for text analysis. Suitability depends on copyright, privacy, the type of file* and the quality of the text**.
*The file type must be supported by the text mining tool you are using.
**The text must be a complete representation of the document you are studying.
Creating corpora
Sometimes no existing corpus addresses your research question, so you will have to create your own. When preparing your corpus, it is worth looking at text preprocessing methods. Preprocessing covers all the steps taken to make a text suitable for text analysis.
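To give an idea of what preprocessing can look like in practice, here is a minimal Python sketch. It is only an illustration and not a recommended pipeline: the steps chosen (lowercasing, splitting into word tokens, removing stopwords), the function name `preprocess` and the very short `STOPWORDS` list are all assumptions made for this example. Which steps you actually need depends on your research question and your text mining tool.

```python
import re

# Hypothetical, deliberately tiny stopword list for illustration only;
# real projects usually rely on a fuller, language-specific list.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "is", "in"}


def preprocess(text, stopwords=STOPWORDS):
    """Return lowercase word tokens from `text`, with stopwords removed."""
    # Step 1: normalise case so "Library" and "library" count as one word.
    lowered = text.lower()
    # Step 2: tokenise. \w+ matches runs of letters, digits and underscores,
    # which drops punctuation but also splits words such as "don't".
    tokens = re.findall(r"\w+", lowered)
    # Step 3: remove very common words that carry little meaning on their own.
    return [token for token in tokens if token not in stopwords]


sample = "The Library is open to students and staff."
print(preprocess(sample))  # ['library', 'open', 'students', 'staff']
```

Each step discards information (capitalisation, punctuation, function words), so it is worth checking whether your analysis needs any of it before removing it.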