[kw-pm] Talks for this week!

abez abez at abez.ca
Mon Jan 18 19:38:12 PST 2010


Thursday, January 21, 2010 @ 7:00 on the University of Waterloo campus,
Davis Centre room 3323.

Abram (me) shall give 2+ talks!

You'll learn about OpenID! Email extracting/parsing! Topic analysis and
natural language processing!

Subversion of OpenID: 10mins+

I'll talk about the structure of OpenID and give an overview of the
protocol. I'll also discuss how to use Net::OpenID::Server and how to
make sure it runs quickly. Then I'll discuss the hoops I had to jump
through on shared hosting to get it all working.

Email Extractor: 10mins+

I will go over an email extractor I wrote with the great help of
Mail::MboxParser, which I use to analyze emails. I will cover what
Mail::MboxParser has to offer, ways to clean up data, and potential uses
for such an extraction.

I will also discuss other modules like Mail::Box: how the modules
differ, why you would use one or the other, and how you can port some of
the missing functionality over.
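
To give a feel for what "extracting" means here, this is a minimal
sketch of pulling sender, subject and plain-text body out of an mbox
file. It is not the extractor from the talk: it is written in Python
with the standard-library mailbox module instead of Mail::MboxParser,
and the names extract_messages and SAMPLE_MBOX are made up for this
illustration.

```python
import mailbox
from email.utils import parseaddr

# Hypothetical two-message mbox used only for illustration.
SAMPLE_MBOX = (
    "From alice@example.org Mon Jan 18 19:38:12 2010\n"
    "From: Alice <alice@example.org>\n"
    "Subject: Talks for this week\n"
    "\n"
    "Three talks on Thursday.\n"
    "\n"
    "From bob@example.org Tue Jan 19 09:00:00 2010\n"
    "From: bob@example.org\n"
    "Subject: Re: Talks for this week\n"
    "\n"
    "See you there.\n"
)


def extract_messages(mbox_path):
    """Return one dict per message: sender address, subject, text body."""
    records = []
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for msg in box:
            body_parts = []
            # walk() visits the message itself and every MIME sub-part.
            for part in msg.walk():
                if part.get_content_type() == "text/plain":
                    payload = part.get_payload(decode=True) or b""
                    charset = part.get_content_charset() or "utf-8"
                    body_parts.append(payload.decode(charset, errors="replace"))
            records.append({
                # parseaddr splits "Name <addr>" into (name, addr).
                "from": parseaddr(msg.get("From", ""))[1],
                "subject": msg.get("Subject", ""),
                "body": "\n".join(body_parts).strip(),
            })
    finally:
        box.close()
    return records
```

Writing SAMPLE_MBOX to a file and passing its path to extract_messages
yields two records. Cleaning the data (stripping quoted replies,
signatures and so on) would be a further pass over the "body" field.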

(implementation driven)

Optional: What's Hot and What's Not: Windowed Developer Topic Analysis: 20mins+

    This talk will go over the idea of topic analysis, a technique you
    can apply to datasets consisting mostly of text. This work was
    presented at ICSM 2009 in Edmonton.


What's Hot and What's Not: Windowed Developer Topic Analysis

As development on a software project progresses, developers shift their
focus between different topics and tasks many times. Managers and
newcomer developers often seek ways of understanding what tasks have
recently been worked on and how much effort has gone into each; for
example, a manager might wonder what unexpected tasks occupied their
team's attention during a period when they were supposed to have been
implementing new features. Tools such as Latent Dirichlet Allocation
(LDA) and Latent Semantic Indexing (LSI) can be used to extract a set of
independent topics from a corpus of commit-log comments. Previous work
in the area has created a single set of topics by analyzing comments
from the entire lifetime of the project. In this paper, we propose
windowing the topic analysis to give a more nuanced view of the system's
evolution. By using a defined time-window of, for example, one month, we
can track which topics come and go over time, and which ones recur. We
propose visualizations of this model that allow us to explore the
evolving stream of topics of development occurring over time. We
demonstrate that windowed topic analysis offers advantages over topic
analysis applied to a project's lifetime because many topics are quite
local.
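
The windowing idea can be sketched in a few lines. This is only a toy
under stated assumptions: it buckets commit comments by calendar month
and uses the most frequent words in each bucket as a crude stand-in for
topics, where the paper applies LDA to each window. The function names,
the stopword list and the (date, message) input format are all made up
for this illustration.

```python
from collections import Counter, defaultdict
from datetime import date

# Tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "and", "to", "of", "in", "for", "on"}


def window_key(day):
    """Map a date to its one-month window, identified by (year, month)."""
    return (day.year, day.month)


def top_terms_by_window(commits, top_n=3):
    """Group (date, message) pairs by month and list each month's top words."""
    windows = defaultdict(Counter)
    for day, message in commits:
        words = [
            w for w in message.lower().split()
            if w.isalpha() and w not in STOPWORDS
        ]
        windows[window_key(day)].update(words)
    return {
        key: [term for term, _ in counts.most_common(top_n)]
        for key, counts in sorted(windows.items())
    }


def recurring_terms(tops):
    """Return terms that show up among the top words of more than one window."""
    seen = Counter()
    for terms in tops.values():
        seen.update(set(terms))
    return sorted(term for term, n in seen.items() if n > 1)
```

Calling top_terms_by_window on a list such as
[(date(2009, 1, 5), "fix parser crash"), ...] gives one short term list
per month, and recurring_terms picks out what comes back in later
windows. Terms that appear in only one window are the "quite local"
topics the abstract refers to.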

(slides and colors wowee, interactive in terms of questions or tutorial
aspect only)


