The U.S. government incarcerates over 1,500,000 inmates in its prison system, which is the largest in the world and keeps growing. Amid concerns about the unsustainable growth of the prison system, prison privatization became a booming industry beginning in the 1980s, under government programs to cut back on the federal workforce. The Justice Department has been contracting with private prison corporations for the incarceration of prisoners.
To provide clients of the private prison industry with a status report on the current market, this project surveys the prisoner population dynamics of the U.S. justice and correction system and summarizes the longitudinal trends of incarceration. A forecast model is built to learn the patterns in past data and predict future changes in the prisoner population. This study can offer important information for business decisions by clients and investors in the prison industry, such as regional expansion of revenue and future contracts for operating custody facilities.
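The text does not say which forecasting model is used, so the following is only a minimal sketch under the assumption of the simplest possible choice: a straight-line trend fitted by ordinary least squares. The function names and the data series are hypothetical placeholders, not real incarceration figures.

```python
def fit_linear_trend(periods, counts):
    """Fit counts = slope * period + intercept by ordinary least squares."""
    n = len(periods)
    mean_x = sum(periods) / n
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in periods)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(periods, counts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(slope, intercept, period):
    """Extrapolate the fitted line to a future period."""
    return slope * period + intercept


# Synthetic placeholder series (period index, population in arbitrary units).
# These values are invented for illustration and are NOT real statistics.
periods = [0, 1, 2, 3, 4]
counts = [100.0, 110.0, 120.0, 130.0, 140.0]

slope, intercept = fit_linear_trend(periods, counts)
next_value = forecast(slope, intercept, 6)
```

A straight line is easy to interpret but cannot capture changes in growth rate; a model fitted to real longitudinal data would likely need to account for those.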
Numerous articles are posted on the internet every day. For a platform that lets members share their ideas through writing and reading, creating an index system for such a high volume of information can be a challenging task. Although many approaches can be taken to design such systems, they fall into two general families: keyword-based and theme-based.
Keyword-based systems give authors the freedom to label their own work. By specifying keywords or #tags, authors allow readers to search for information that aligns with their interests through author-generated labels. Although very flexible and dynamic, a keyword-based indexing system can be very costly to implement and maintain. Because authors may submit keywords in an arbitrary way, the process of searching and retrieving information becomes uncertain. For instance, a word may appear in various cases and derived forms, and different words can be semantically similar. For readers' searches to hit the target accurately and comprehensively, the index system has to recognize different cases and variations of the same word and establish semantic similarity.
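To make the normalization burden concrete, here is a minimal sketch of how a keyword index might fold case and strip common suffixes before lookup. The suffix list, function names, and sample tags are all hypothetical; a production system would more likely rely on an established stemmer or lemmatizer, and this sketch does not attempt semantic similarity at all.

```python
import re

# Hypothetical suffix list for a deliberately crude stemmer.
SUFFIXES = ("ing", "ed", "es", "s")


def normalize_keyword(raw):
    """Lower-case a tag, drop '#' and punctuation, and strip one common suffix."""
    word = re.sub(r"[^a-z0-9]", "", raw.lower())
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(articles):
    """Map each normalized keyword to the set of article ids tagged with it."""
    index = {}
    for article_id, tags in articles.items():
        for tag in tags:
            index.setdefault(normalize_keyword(tag), set()).add(article_id)
    return index


def search(index, query):
    """Look up a reader's query after applying the same normalization."""
    return index.get(normalize_keyword(query), set())


# Invented sample data: article ids mapped to author-supplied tags.
index = build_index({"a1": ["#Cooking", "travel"], "a2": ["cooks"]})
matches = search(index, "Cooked")
```

Even this small example shows the maintenance cost: every rule added to handle one family of word forms risks merging unrelated words, and synonyms still go unmatched.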
On the other hand, a theme-based system only allows articles to be labeled with predefined topics. Instead of dealing with a potentially infinite number of keywords, a theme-based indexing system can be more organized and can help readers quickly narrow down the scope of their search. Such advantages make it an ideal complement to a keyword-based system.
This project aims to explore different machine learning approaches that can automate the task of classifying articles into themes according to their titles. The final classification algorithm is semi-automatic: it should suggest the most likely themes for the publisher or creator to choose from, which helps promote a consistent framework that supports more robust and efficient indexing and searching.
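As an illustration of the kind of suggestion step described here, the sketch below ranks themes for a title with a small multinomial Naive Bayes model written in plain Python. This is an assumed stand-in, not necessarily one of the approaches the project explores; the class name, the themes, and the training titles are all invented for the example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(title):
    """Split a title into lower-case alphanumeric words."""
    return re.findall(r"[a-z0-9]+", title.lower())


class ThemeSuggester:
    """Multinomial Naive Bayes over title words with add-one smoothing."""

    def __init__(self):
        self.theme_counts = Counter()            # number of titles per theme
        self.word_counts = defaultdict(Counter)  # word frequencies per theme
        self.vocabulary = set()

    def fit(self, titles, themes):
        for title, theme in zip(titles, themes):
            words = tokenize(title)
            self.theme_counts[theme] += 1
            self.word_counts[theme].update(words)
            self.vocabulary.update(words)
        return self

    def suggest(self, title, k=3):
        """Return the k themes with the highest log-probability for the title."""
        total_titles = sum(self.theme_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for theme, n_titles in self.theme_counts.items():
            total_words = sum(self.word_counts[theme].values())
            score = math.log(n_titles / total_titles)
            for word in tokenize(title):
                count = self.word_counts[theme][word]
                score += math.log((count + 1) / (total_words + vocab_size))
            scores[theme] = score
        return sorted(scores, key=scores.get, reverse=True)[:k]


# Invented toy training set; a real system would need far more labeled titles.
titles = [
    "Easy pasta recipes for dinner",
    "Training a neural network in Python",
    "Ten budget travel tips for Europe",
    "Baking bread at home",
    "Debugging Python code faster",
    "A weekend travel guide to Lisbon",
]
themes = ["food", "tech", "travel", "food", "tech", "travel"]

suggester = ThemeSuggester().fit(titles, themes)
suggestions = suggester.suggest("Debugging a Python network", k=2)
```

Returning a short ranked list rather than a single label keeps the author in the loop: the model narrows the choice, and the publisher makes the final decision.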
Similar principles can also be adopted in businesses where many different departments handle a large number of client requests. A recommendation system can suggest to agents the possible categories a ticket belongs to. Such a system could significantly increase human operators' efficiency, as they would otherwise be required to memorize all available categories and departments.