Friday, August 9, 2013

Cheliotis: Social network analysis


Visualization: A bipartite network of legislation and organizations

Visualizing Positive and Negative Endorsements of S.1782 (2007)

Skye Bender-de Moll
Nehalem, Oregon

A bipartite network of legislation and organizations

Caption



This image depicts a network of bills (squares) and endorsing organizations (circles) around S.1782 during the 110th U.S. Congress. The green and red ties indicate support for or opposition to S.1782 by an organization. Gray ties link to additional bills positively endorsed by organizations. The information was collected by MapLight.org from various public documents. Color indicates similar group categorization, and the size of each node is proportional to the total number of endorsements it gave or received in the database. Mousing over small nodes will reveal the title or additional bill info. Clicking on bills will load an associated web page.
S.1782 was chosen as the focus for this extended ego-network visualization because the title and description of the bill give very little indication of its intent or possible effects. MapLight's table of supporters and opponents is quite helpful, but ideally it would be possible to simultaneously see where each of a bill's endorsing organizations stands on other issues in order to place their endorsement in context. This layout results in intersecting circles of legislative preference around groups with similar patterns of endorsements, revealing separation and overlap between the camps surrounding S.1782. Opposition to the bill seems to have come from large industry lobby groups, corporations, and business associations. Support came from consumer and activist groups. The "nays" seem to have won, as the bill died in committee.
The layout was produced using SoNIA and the MDSJ library. The MDS algorithm was run for 150 iterations on the matrix of all-pairs shortest-path distances, with a scaling exponent of -7 to weight distant ties less. Some node positions were manually tweaked for legibility.
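For readers who want to experiment, the basic idea of embedding nodes by their all-pairs shortest-path distances can be sketched in Python. Note this is a simplification: MDSJ performs iterative stress minimization with the distance-weighting exponent mentioned above, while the sketch below uses plain classical MDS, and the function name is illustrative rather than anything from the SoNIA codebase.

```python
import networkx as nx
import numpy as np

def classical_mds_layout(G):
    """2-D node layout from the all-pairs shortest-path distance matrix."""
    nodes = list(G.nodes())
    n = len(nodes)
    sp = dict(nx.all_pairs_shortest_path_length(G))
    # disconnected pairs get a large placeholder distance
    D = np.array([[sp[u].get(v, n) for v in nodes] for u in nodes], dtype=float)
    # classical MDS: double-center the squared distances...
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    # ...and take the top-2 eigenpairs as coordinates
    vals, vecs = np.linalg.eigh(B)
    top = np.argsort(vals)[::-1][:2]
    coords = vecs[:, top] * np.sqrt(np.maximum(vals[top], 0))
    return {u: coords[i] for i, u in enumerate(nodes)}
```

The resulting dictionary can be passed straight to nx.draw(G, pos=...) for a quick look before any manual tweaking.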

Self-Commentary

Note: for this example to function correctly, in addition to the PNG image it must load a script file and an XML data file, and include JavaScript in the header of the page. Not sure how this will actually work with the JOSS journal's formatted web page.
Data on the entire set of bill endorsements were kindly provided by MapLight.org in CSV form. I loaded it into a MySQL database so that I could experiment with various types of networks. Although the co-endorsing and co-endorsee networks of the legislative space were in some ways more interesting, they would require much more work to make a presentable image. I also wanted to test the feasibility of using visualization to learn something about an arbitrarily chosen bill. I wrote a utility program in Java to facilitate the process of testing various queries to select the node attribute data and construct the tie relations. The queries are processed to produce a .son-formatted network file, which was loaded into SoNIA for visualization.
I initially expected that using different tie weights for the S.1782 ties and the ties to the rest of the bills would help structure the network. The approach was not successful, so I ended up simply giving the direct ties greater width to increase their visual impact, and graying out the ties to the other bills to focus attention on the bill of interest. Earlier versions of the image were produced with a KK layout, but I found the MDSJ MDS layout somewhat more stable, and it allowed me to adjust the distance parameter to control the "clumpiness" of the (loosely) structurally equivalent node groups. I adjusted some node positions and shortened some labels to reduce clutter in the resulting layout. A challenge was making the labels legible without crowding the layout or distracting from the ties. In the end, I settled for making many of the labels too small to read, but including a mouseover option to show the labels as a "tooltip" on the image in a web browser.
To prepare for the web version, I exported the .PNG image and an .XML file containing node coordinates and labels from SoNIA. I adapted some JavaScript code previously written by a co-developer to read the XML file and produce the image mouseovers. I added a feature to parse the titles of the bills into an appropriate URL on the GovTrack site, making it possible to click through to more bill information. I also inserted bill titles for selected nodes directly into the tooltip, hopefully making it possible to quickly get a feel for the type of bills in each area without cluttering the layout with the long labels. This is important because the bill numbers themselves are not meaningful, and I do not have (or know of) any bill classification data that could be used to help the viewer determine whether the bill groupings produced by the layout make sense, or what an endorsement implies about an organization's political position.
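The title-to-URL step might look something like the sketch below. The regular expression and the bill.xpd URL pattern are my assumptions about GovTrack's scheme at the time, not the author's actual code; verify the pattern against the live site before relying on it.

```python
import re

def govtrack_url(bill_label, congress=110):
    """Turn a bill label like 'S.1782' or 'H.R.2419' into a GovTrack link.

    The URL pattern is an assumption based on GovTrack's old bill.xpd
    scheme; the current site may use a different layout.
    """
    m = re.match(r"([A-Za-z.]+)\s*(\d+)", bill_label)
    if not m:
        return None
    prefix = m.group(1).replace(".", "").lower()  # 's', 'hr', 'sres', ...
    number = m.group(2)
    return f"http://www.govtrack.us/congress/bill.xpd?bill={prefix}{congress}-{number}"
```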
Although I think this is an interesting image, I see several issues. One is that nodes that endorse only a single bill do not have well-defined positions in this layout; they tend to land in arbitrary regions, and their positions are likely to be falsely assigned significance by the viewer. As with most network visualizations, the groupings might be more rigorously created with a clustering algorithm. There may also be data coverage and sampling issues in the underlying data.

PEER REVIEW COMMENT No. 1
I really appreciate the author’s use of a careful color scheme of support and opposition so the viewer may easily discern coalition blocks.  The visualization also includes other bills supported by each organization, which hints at a more contextualized story – would love to see this fully searchable/interactive. It appears that subgroups of organizations tend to support similar bills, but organizational coalitions are far from uniform; I’m guessing that sets of organizations may come together or fall apart depending on the particular bill of interest. Thus, while the visualization nicely captures the coalition structure for the particular bill of interest, there is, unfortunately, very little sense of how the other bills are interrelated.

PEER REVIEW COMMENT No. 2
The positioning of the organizations that supported and opposed the bill around S.1782 makes the two sides visually easy to locate. In addition to the corresponding edge color scheme, the image conveys a clear picture of the bill’s supporters as mainly activist groups and its opposition as mainly business organizations. While the color an organization is assigned shows its voting pattern, the reader may be overwhelmed by the volume of bills connected by gray lines and surrounding the organizations; it would be nice to have a way to summarize the “similar” bills.

PEER REVIEW COMMENT No. 3
This visualization makes great use of user input, providing unabbreviated node identities on mouse-over and detailed information when clicked. This provides a great circumvention of the trade-off between node information and graph clutter. It also employs an effective yet simple color/layout scheme to display a bipartite graph without the artificial separations that often stilt bipartite graphs. I wonder how this graph would look with the other edges colored (perhaps in faint red/green) according to whether the organization supported or opposed S.1782 – would it give a sense of how firmly these organizational battle lines hold across bills?

JoSS

Thursday, August 8, 2013

Group-in-a-box layout for multi-faceted community analysis


Visualization: Radial tree diagram of insomnia

Radial Tree Diagram [RTD] of Insomnia

Philip Topham
Lnx Research, General Manager

Radial Tree Diagram [RTD] of Insomnia



Self-Commentary
                Radial tree diagram [RTD] of the Insomnia coauthor invisible college and its cliques (2005-2009). The RTD helps an analyst understand the Insomnia research community by bundling edges that would normally obscure meaning in a traditional network layout. The RTD also maintains the overall network topology as specific relationships are browsed, sparing the viewer from having to reorient themselves, as is the case with traditional spring-embedded or force-directed approaches that redraw themselves depending on view and focus.

PEER REVIEW COMMENT No. 1
This visualization uses a radial tree diagram to contain a vast amount of information about the structure of co-authorships among insomnia researchers. The image makes good use of color and smooth lines and is visually appealing as a result. However, the volume of information makes it difficult to put together a general sense of the network; might this be due in part to position on the circle being determined by substantive area rather than connectedness?

PEER REVIEW COMMENT No. 2
This is a fascinating interactive bit; and I find myself continually “poking around” the visualization.  Very fun!  The colors, rendering and flow are beautiful.  Like many circular layouts, the trade between clarity of connection (highlighted here) and general topology (highlighted in space-based layouts) makes it difficult to discern “distance” between research groups.  Here a *little* more information on how nodes were placed around the circle might help. 

PEER REVIEW COMMENT No. 3
This flash-based visualization supercharges the circle graph into a luxuriously information-rich interactive medium.  It makes beautiful and effective use of color – using it when helpful, dropping it to deemphasize and avoid clutter.  I would be interested to see if additional functionality can be built into the user interface – perhaps some way to pull up a group/individual bio with a double click?

Wednesday, August 7, 2013

The Internet is displacing teachers from the classroom

How Big Data Is Taking Teachers Out of the Lecturing Business

Schools and universities are embracing technology that tailors content to students' abilities and takes teachers out of the lecturing business. But is it an improvement?


By Seth Fletcher

When Arnecia Hawkins enrolled at Arizona State University last fall, she did not realize she was volunteering as a test subject in an experimental reinvention of American higher education. Yet here she was, near the end of her spring semester, learning math from a machine. In a well-appointed computer lab in Tempe, on Arizona State's desert resort of a campus, she and a sophomore named Jessica were practicing calculating annuities. Through a software dashboard, they could click and scroll among videos, text, quizzes and practice problems at their own pace. As they worked, their answers, along with reams of data on the ways in which they arrived at those answers, were beamed to distant servers. Predictive algorithms developed by a team of data scientists compared their stats with data gathered from tens of thousands of other students, looking for clues as to what Hawkins was learning, what she was struggling with, what she should learn next and how, exactly, she should learn it.
Having a computer for an instructor was a change for Hawkins. “I'm not gonna lie—at first I was really annoyed with it,” she says. The arrangement was a switch for her professor, too. David Heckman, a mathematician, was accustomed to lecturing to the class, but he had to take on the role of a roving mentor, responding to raised hands and coaching students when they got stumped. Soon, though, both began to see some benefits. Hawkins liked the self-pacing, which allowed her to work ahead on her own time, either from her laptop or from the computer lab. For Heckman, the program allowed him to more easily track his students' performance. He could open a dashboard that told him, in granular detail, how each student was doing—not only who was on track and who was not but who was working on any given concept. Heckman says he likes lecturing better, but he seems to be adjusting. One definite perk for instructors: the software does most of the grading for them.
At the end of the term, Hawkins will have completed the last college math class she will probably ever have to take. She will think back on this data-driven course model—so new and controversial right now—as the “normal” college experience. “Do we even have regular math classes here?” she asks.
Big Data Takes Education
Arizona State's decision to move to computerized learning was born, at least in part, of necessity. With more than 70,000 students, Arizona State is the largest public university in the U.S. Like institutions at every level of American education, it is going through some wrenching changes. The university has lost 50 percent of its state funding over the past five years. Meanwhile enrollment is rising, with alarmingly high numbers of students showing up on campus unprepared to do college-level work. “There is a sea of people we're trying to educate that we've never tried to educate before,” says Al Boggess, director of the Arizona State math department. “The politicians are saying, ‘Educate them. Remediation? Figure it out. And we want them to graduate in four years. And your funding is going down, too.’”
Two years ago Arizona State administrators went looking for a more efficient way to shepherd students through basic general-education requirements—particularly those courses, such as college math, that disproportionately cause students to drop out. A few months after hearing a pitch by Jose Ferreira, the founder and CEO of the New York City adaptive-learning start-up Knewton, Arizona State made a big move. That fall, with little debate or warning, it placed 4,700 students into computerized math courses. Last year some 50 instructors coached 7,600 Arizona State students through three entry-level math courses running on Knewton software. By the fall of 2014 ASU aims to adapt six more courses, adding another 19,000 students a year to the adaptive-learning ranks. (In May, Knewton announced a partnership with Macmillan Education, a sister company to Scientific American.)
Arizona State is one of the earliest, most aggressive adopters of data-driven, personalized learning. Yet educational institutions at all levels are pursuing similar options as a way to cope with rising enrollments, falling budgets and more stringent requirements for student achievement. Public primary and secondary schools in 45 states and the District of Columbia are rushing to implement new, higher standards in English-language arts and mathematics known as the Common Core state standards, and those schools need new instructional materials and tests to make that happen. Around half of those tests will be online and adaptive, meaning that a computer will tailor questions to each student's ability and calculate each student's score [see “Why We Need High-Speed Schools,” on page 69]. School systems are experimenting with a range of other adaptive programs, from math and reading lessons for elementary school students to “quizzing engines” that help high school students prepare for Advanced Placement exams. The technology is also catching on overseas. The 2015 edition of the Organization for Economic Co-operation and Development's Program for International Student Assessment (PISA) test, which is given to 15-year-olds (in more than 70 nations and economies so far) every three years, will include adaptive components to evaluate hard-to-measure skills such as collaborative problem solving.
Proponents of adaptive learning say that technology has finally made it possible to deliver individualized instruction to every student at an affordable cost—to discard the factory model that has dominated Western education for the past two centuries. Critics say it is data-driven learning, not traditional learning, that threatens to turn schools into factories. They see this increasing digitization as yet another unnecessary sellout to for-profit companies that push their products on teachers and students in the name of “reform.” The supposedly advanced tasks that computers can now barely pull off—diagnosing a student's strengths and weaknesses and adjusting materials and approaches to suit individual learners—are things human teachers have been doing well for hundreds of years. Instead of delegating these tasks to computers, opponents say, we should be spending more on training, hiring and retaining good teachers.
And while adaptive-learning companies claim to have nothing but the future of America's children in mind, there is no denying the potential for profit. Dozens of them are rushing to get in on the burgeoning market for instructional technology, which is now a multibillion-dollar industry [see box at left]. As much as 20 percent of instructional content in K–12 schools is already delivered digitally, says Adam Newman, a founding partner of the market-analysis firm Education Growth Advisors. Although adaptive-learning software makes up only a small slice of the digital-instruction pie—around $50 million for the K–12 market—it could grow quickly. Newman says the concept of adaptivity is already very much in the water in K–12 schools. “In K–12, the focus has been on differentiating instruction for years,” he says. “Differentiating instruction, even without technology, is really a form of adaptation.”
Higher-education administrators are warming up to adaptivity, too. In a recent Inside Higher Ed/Gallup poll, 66 percent of college presidents said they found adaptive-learning and testing technologies promising. The Bill & Melinda Gates Foundation has launched the Adaptive Learning Market Acceleration Program, which will issue 10 $100,000 grants to U.S. colleges and universities to develop adaptive courses that enroll at least 500 students over three semesters. “In the long term—20 years out—I would expect virtually every course to have an adaptive component of some kind,” says Peter Stokes, an expert on digital education at Northeastern University. That will be a good thing, he says—an opportunity to apply empirical study and cognitive science to education in a way that has never been done. In higher education in particular, “very, very, very few instructors have a formal education in how to teach,” he says. “We do things, and we think they work. But when you start doing scientific measurement, you realize that some of our ways of doing things have no empirical basis.”
The Science of Adaptivity
In general, “adaptive” refers to a computerized-learning interface that constantly assesses a student's thinking habits and automatically customizes material for him or her. Not surprisingly, though, competitors argue ferociously about who can claim the title of true adaptivity. Some say that a test that does nothing more than choose your next question based on whether you get the item in front of you correct—a test that steers itself according to the logic of binary branching—does not, in 2013, count as fully adaptive. In this view, adaptivity requires the creation of a psychometric profile of each user, plus the continuous adjustment of the experience based on that person's progress.
To make this happen, adaptive-software makers must first map the connections among every concept in a piece of learning material. Once that is done, every time a student watches a video, reads an explanation, solves a practice problem or takes a quiz, data on the student's performance, the effectiveness of the content, and more flow to a server. Then the algorithms take over, comparing that student with thousands or even millions of others. Patterns should emerge. It could turn out that a particular student is struggling with the same concept as students who share a specific psychometric profile. The software will know what works well for that type of student and will adjust the material accordingly. With billions of data points from millions of students and given enough processing power and experience, these algorithms should be able to do all kinds of prognostication, down to telling you that you will learn exponents best between 9:42 and 10:03 a.m.
They should also be able to predict the best way to get you to remember the material you are learning. Ulrik Juul Christensen, CEO of Area9, the developer of the data-analysis software underpinning McGraw-Hill's adaptive LearnSmart products, emphasizes his company's use of the concept of memory decay. More than two million students currently use LearnSmart's adaptive software to study dozens of topics, either on their own or as part of a course. Research has shown that those students (all of us, really) remember a new word or fact best when they learn it and then relearn it when they are just on the cusp of forgetting it. Area9's instructional software uses algorithms to predict each user's unique memory-decay curve so that it can remind a student of something learned last week at the moment it is about to slip out of his or her brain forever.
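The underlying idea can be sketched with a simple exponential forgetting curve. This is a generic spaced-repetition model, not Area9's proprietary algorithm, and all function names and parameter values here are illustrative.

```python
import math

def recall_probability(hours_elapsed, stability_hours):
    """Predicted recall under an exponential forgetting curve
    R(t) = exp(-t / stability)."""
    return math.exp(-hours_elapsed / stability_hours)

def next_review_in(stability_hours, threshold=0.9):
    """Hours until predicted recall decays to `threshold` --
    the 'cusp of forgetting' at which to schedule a reminder."""
    return -stability_hours * math.log(threshold)
```

With a fitted stability of 48 hours and a 50 percent threshold, the next reminder lands roughly 33 hours out; real systems refit the stability parameter per user and per item after every review, which is where the per-student memory-decay curve comes in.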
Few human instructors can claim that sort of prescience. Nevertheless, Christensen dismisses the idea that computers could ever replace teachers. “I don't think we are so stupid that we would let computers take over teaching our kids,” he says.
Backlash
In March, Gerald J. Conti, a social studies teacher at Westhill High School in Syracuse, N.Y., posted a scathing retirement letter to his Facebook page that quickly became a viral sensation. “In their pursuit of Federal tax dollars,” he wrote, “our legislators have failed us by selling children out to private industries such as Pearson Education,” the educational-publishing giant, which has partnered with Knewton to develop products. “My profession is being demeaned by a pervasive atmosphere of distrust, dictating that teachers cannot be permitted to develop and administer their own quizzes and tests (now titled as generic ‘assessments’) or grade their own students' examinations.” Conti sees big data leading not to personalized learning for all but to an educational monoculture: “STEM [science, technology, engineering and mathematics] rules the day, and ‘data driven’ education seeks only conformity, standardization, testing and a zombie-like adherence to the shallow and generic Common Core.”
Conti's letter is only one example of the backlash building against tech-oriented, testing-focused education reform. In January teachers at Garfield High School in Seattle voted to boycott the Measures of Academic Progress (MAP) test, administered in school districts across the country to assess student performance. After tangling with their district's superintendent and school board, the teachers continued the boycott, which soon spread to other Seattle schools. Educators in Chicago and elsewhere held protests to show solidarity. In mid-May it was announced that Seattle high schools would be allowed to opt out of MAP, as long as they replaced it with some other evaluation.
It would be easy for proponents of data-driven learning to counter these protests if they could definitely prove that their methods work better than the status quo. But they cannot do that, at least not yet. Empirical evidence about effectiveness is, as Darrell M. West, an adaptive-learning proponent and founder of the Brookings Institution's Center for Technology Innovation, has written, “preliminary and impressionistic.” Any accurate evaluation of adaptive-learning technology would have to isolate and account for all variables: increases or decreases in a class's size; whether the classroom was “flipped” (meaning homework was done in class and lectures were delivered via video on the students' own time); whether the material was delivered via video, text or game; and so on. Arizona State says 78 percent of students taking the Knewton-ized developmental math course passed, up from 56 percent before. Yet it is always possible that more students are passing not because of technology but because of a change in policy: the university now lets students retake developmental math or stretch it over two semesters without paying tuition twice.
Even if proponents of adaptive technology prove that it works wonderfully, they will still have to contend with privacy concerns. It turns out that plenty of people find pervasive psychometric-data gathering unnerving. Witness the fury that greeted inBloom earlier this year. InBloom essentially offers off-site digital storage for student data—names, addresses, phone numbers, attendance, test scores, health records—formatted in a way that enables third-party education applications to use it. When inBloom was launched in February, the company announced partnerships with school districts in nine states, and parents were outraged. Fears of a “national database” of student information spread. Critics said that school districts, through inBloom, were giving their children's confidential data away to companies who sought to profit by proposing a solution to a problem that does not exist. Since then, all but three of those nine states have backed out.
This might all seem like overreaction, but to be fair, adaptive-education proponents already talk about a student's data-generated profile following them throughout their educational career and even beyond. Last fall the education-reform campaign Digital Learning Now released a paper arguing for the creation of “data backpacks” for pre-K–12 students—electronic transcripts that kids would carry with them from grade to grade so that they will show up on the first day of school with “data about their learning preferences, motivations, personal accomplishments, and an expanded record of their achievement over time.” Once it comes time to apply for college or look for a job, why not use the scores stored in their data backpacks as credentials? Something similar is already happening in Japan, where it is common for managers who have studied English with the adaptive-learning software iKnow to list their iKnow scores on their resumes.
This Is Not a Test
It is far from clear whether concerned parents and scorned instructors are enough to stop the march of big data on education. “The reality is that it's going to be done,” says Eva Baker, director of the Center for the Study of Evaluation at the University of California, Los Angeles. “It's not going to be a little part. It's going to be a big part. And it's going to be put in place partly because it's going to be less expensive than doing professional development.”
That does not mean teachers are going away. Nor does it mean that schools will become increasingly test-obsessed. It could mean the opposite. Sufficiently advanced testing is indistinguishable from instruction. In a fully adaptive classroom, students will be continually assessed, with every keystroke and mouse click feeding a learner profile. High-stakes exams could eventually disappear, replaced by the calculus of perpetual monitoring.
Long before that happens, generational turnover could make these computerized methods of instruction and testing, so foreign now, unremarkable, as they are for Arizona State's Hawkins and her classmates. Teachers could come around, too. Arizona State's executive vice provost Phil Regier believes they will, at least: “I think a good majority of the instructors would say this was a good move. And by the way, in three years 80 percent of them aren't going to know anything else.”
Take an adaptive quiz on state capitals at ScientificAmerican.com/aug2013/learn-smart

Visualization: Radial community structures

Pinwheel Layout to Highlight Community Structure

Bernie Hogan
Research Fellow, Oxford Internet Institute

Pinwheel Layout to Highlight Community Structure

Caption
The intuitive appeal of force-directed layouts is due in part to their ability to represent underlying community structures. Such diagrams show dense pockets of nodes with bridges connecting across clusters. Yet it is possible to start, rather than end, with community structure. This is a “pinwheel” diagram using the author’s Facebook personal network (captured July 15, 2009). Nodes represent the author’s friends and links represent friendships among them. The author is not shown. Each ‘wing’ radiating outwards is a partition produced by a greedy community detection algorithm (Wakita and Tsurumi, 2007). Wings are manually labelled. Node ordering within each wing is based on degree. Node color and size are also based on degree. Node position is based on a polar coordinate system: each node sits at an equal angle of 360º/n, with a radius given by a log-scaled measure of betweenness. Higher values are closer to the center, indicating a sort of cross-partition ‘gravity’.
This layout has several notable features:

                - The angle of each wing is proportionate to its share of the network. Thus 25 percent of nodes go from 0 to 90º.
                - Partitions are distinguished by their position rather than a node’s color or shape.
                - The tail indicates the periphery of each partition. A wing with many tail nodes indicates many people who are only tied to other group members.
                - Edges crossing the center show between-partition connections. Since nodes are sorted by degree it is easy to see if edges originate from the most highly connected nodes or the entire partition.
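
The placement rules above can be sketched in a few lines of Python. This is my reconstruction of the scheme, not the author's NodeXL workbook; the function name is illustrative, and the radius deliberately spans only half the range, echoing the choice described in the self-commentary.

```python
import math

def pinwheel_positions(partitions, betweenness):
    """Polar 'pinwheel' placement: each partition gets an angular wedge
    proportional to its share of nodes; within a wedge, nodes sit at
    equal angular steps, and the radius shrinks (over half the radius
    only) as log-scaled betweenness grows, pulling bridges centerward."""
    n = sum(len(p) for p in partitions)
    max_b = max(betweenness.values()) or 1.0
    pos, start = {}, 0.0
    for part in partitions:
        span = 2 * math.pi * len(part) / n  # wedge angle ∝ partition share
        for i, node in enumerate(part):     # caller pre-sorts each wedge by degree
            theta = start + span * (i + 0.5) / len(part)
            r = 1.0 - 0.5 * math.log1p(betweenness[node]) / math.log1p(max_b)
            pos[node] = (r * math.cos(theta), r * math.sin(theta))
        start += span
    return pos
```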

                This visualization is oriented towards well-connected modular networks (meaning they are easily partitioned into distinct communities). Facebook egocentered networks often have these properties, whereby each partition represents a life course stage or social context and close friends link between partitions.
                In this network it is easy to see a strong series of linkages between high school and university as well as high school and family. There are many ties between the current co-workers and professional colleagues, and neither connects substantially to high school. While just as populous, the professional partition is far less dense than the high school partition.

Self-Commentary
                The data were captured using a custom-built, publicly available Facebook application. This application employs the Facebook API to query for a user’s friends and the connections between those friends. The clustering and layout were done using NodeXL, a network analysis add-on for Excel 2007.
                To create this diagram, I “hacked” many of the features of NodeXL. For example, to lay out nodes within each partition according to degree, I first had to convert the cluster names to cluster numbers. Since degree has a maximal value of n-1, I multiplied each node’s cluster number by n and then added its degree, thereby ensuring no overlaps. The polar coordinate system does not pay attention to the layout order, so I first laid the nodes out using a circle layout and converted the X and Y coordinates to radians. The betweenness values span only half of the radius rather than the full radius because otherwise links between adjacent communities would look messy.
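The overlap-free sort key described above is simple enough to sketch (the function name is mine, not NodeXL's):

```python
def layout_sort_key(cluster_number, degree, n):
    """Single sort key that groups nodes by cluster and orders them by
    degree within each cluster: since degree <= n - 1, the value
    cluster_number * n + degree can never collide across clusters."""
    return cluster_number * n + degree
```

Sorting a list of (name, cluster, degree) triples by this key yields the cluster-then-degree ordering the pinwheel wings need.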
                The viewer may notice that I used degree for three metrics: within-partition ordering, size, and color. The combination of all three gives the wings their shape and gradient. Whatever metric is used for within-cluster ordering, I recommend also using it for color and size.
                NodeXL is very flexible but it still has many limitations. For example, it currently only includes one community detection algorithm (although it is simple enough to paste in other partitions done through other packages). This particular algorithm misclassified a handful of nodes leading to unnecessary edge-crossings.

PEER REVIEW COMMENT No. 1
In this representation of Facebook, partitions are denoted by position rather than color – which is a clever way to layer more information into the figure. The groups break down along foci, or social contexts (high school, camp, college, work, etc.), and the pinwheel figure nicely illustrates the different foci in ego's network. However, one wishes the highly interconnected foci were positioned closer together along the wheel in some way that would minimize the edge-crossings in the middle. I wonder if the duplication of degree in size, color, and position is necessary, or if one could usefully integrate other information through color (say, the ratio of ties within and between foci) to increase the information content of the image.

PEER REVIEW COMMENT No. 2
The pinwheel layout highlights the organizationally focused nature of friendship ties in the author’s ego network. The diagram captures the relationships that span organizational settings, and more importantly, it does a very good job of showing how densely (or sparsely) the settings are connected with each other. This effect is aided by having the set of ties related to each setting be sized proportional to its contribution to the ego network. The visualization may be improved by imposing some sort of order on where the settings themselves are placed, so that they are chronological or so that settings that share more ties are next to each other.

PEER REVIEW COMMENT No. 3
This submission successfully rebels against the conventional force-directed layout, reengineering the circular layout in an information-rich way.  The degree ordering is effective in allowing one to get a sense of how different partitions connect to each other, both in terms of the number and distribution of cross-ties.  I'd love to see whether this layout would work as a semi-circle (which might permit the degree ordering to be arranged in a roughly up-down fashion while still preserving its radial merits).