Parallel and Distributed Statistical-based Extraction of Relevant Multiwords from Large Corpora