generic method to extract the text layer of a document