8.7: Krackhardt's Graph Theoretical Dimensions of Hierarchy

Embeddings of actors in dyads, triads, neighborhoods, clusters, and groups are all ways in which the social structure of a population may display "texture". All of these forms of embedding speak to the issue of the "horizontal differentiation" of the population - separate, but not necessarily ranked or unequal, groupings.

A very common form of embedding of actors in structures, though, does involve unequal rankings. Hierarchies, in which individuals or sub-populations are not only differentiated, but also ranked, are extremely common in social life. The degree of hierarchy in a population speaks to the issue of "vertical differentiation".

While we all have an intuitive sense of what it means for a structure to be a hierarchy, most would agree that structures can be "more or less" hierarchical. It is necessary to be quite precise about the meaning of the term if we are going to build indexes to measure the degree of hierarchy.

Krackhardt (1994) provided an elegant definition of the meaning of hierarchy, and developed measures of each of the four component dimensions of the concept that he identified. Krackhardt defines a pure, "ideal typical" hierarchy as an "out-tree" graph: a directed graph in which all points are connected and every node except one (the "boss") has an in-degree of exactly one. This means that all actors in the graph (except the ultimate "boss") have a single superior node. The simplest "hierarchy" is a directed line graph: A to B to C to D, and so on. More complex hierarchies may have wider, and varying, "spans of control" (out-degrees of points).

This very simple definition of the pure type of hierarchy can be deconstructed into four individually necessary and jointly sufficient conditions. Krackhardt develops index numbers to assess the extent to which each of the four dimensions deviates from the pure ideal type of an out-tree, and hence develops four measures of the extent to which a given structure resembles the ideal typical hierarchy.

1) Connectedness: To be a pure out-tree, a graph must be connected into a single component - all actors are embedded in the same structure. We can measure the extent to which this is not true by looking at the ratio of the number of pairs in the directed graph that are reachable relative to the total number of ordered pairs. That is, what proportion of actors cannot be reached by other actors? Where a graph has multiple components - multiple unconnected sub-populations - the proportion not reachable can be high. If all the actors are connected in the same component, so that there is a "unitary" structure, the graph is more hierarchical.

2) Hierarchy: To be a pure out-tree, there can be no reciprocated ties. Reciprocal relations between two actors imply equal status, and this denies pure hierarchy. We can assess the degree of deviation from pure hierarchy by counting the number of pairs that have reciprocated ties relative to the number of pairs where there is any tie; that is, what proportion of all tied pairs have reciprocated ties.

3) Efficiency: To be a pure out-tree, each node (except the ultimate boss) must have an in-degree of one. That is, each actor has a single boss. This aspect of the ideal type is termed "efficiency" because structures with multiple bosses have unnecessary, redundant communication of orders from superiors to subordinates. The amount of deviation from this aspect of the pure out-tree can be measured by counting the number of links in excess of the minimum necessary (one less than the number of actors, since the ultimate boss has no boss), relative to the maximum possible number of excess links. The larger this surplus, the greater the inefficiency. This dimension thus measures the extent to which actors have a "single boss".

4) Least upper bound (LUB): To be a pure out-tree, each pair of actors (except pairs formed between the ultimate boss and others) must have an actor that directs ties to both - that is, command must be unified. The deviation of a graph from this condition can be measured by counting the numbers of pairs of actors that do not have a common boss relative to the number of pairs that could (which depends on the number of actors and the span of control of the ultimate boss).
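To make the first two dimensions concrete, here is a minimal Python sketch using the networkx library. It follows the verbal descriptions above; Krackhardt's published formulas (and UCINET's implementation) may differ in details, and the graph used is a toy example.

    import networkx as nx

    def connectedness(G: nx.DiGraph) -> float:
        """Fraction of unordered node pairs joined by a semipath,
        i.e., lying in the same weak component."""
        n = G.number_of_nodes()
        linked = sum(len(c) * (len(c) - 1) // 2
                     for c in nx.weakly_connected_components(G))
        return linked / (n * (n - 1) // 2)

    def hierarchy(G: nx.DiGraph) -> float:
        """1 minus the proportion of tied pairs whose tie is reciprocated."""
        recip_pairs = sum(1 for u, v in G.edges() if G.has_edge(v, u)) // 2
        tied_pairs = G.number_of_edges() - recip_pairs
        return 1 - recip_pairs / tied_pairs

    # A directed line A -> B -> C -> D is a minimal pure out-tree:
    G = nx.DiGraph([("A", "B"), ("B", "C"), ("C", "D")])
    print(connectedness(G), hierarchy(G))  # 1.0 1.0

Both indexes equal 1.0 for any pure out-tree; reciprocated ties lower the hierarchy score, and multiple components lower connectedness.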

The Network>Network Properties>Krackhardt GTD algorithms calculate indexes of each of the four dimensions, where higher scores indicate greater hierarchy. Figure 8.13 shows the results for the Knoke information network.

Figure 8.13: Output of Network>Network Properties>Krackhardt GTD for Knoke information network

The information network does form a single component, as there is at least one actor that can reach all others. So, the first dimension of pure hierarchy - that all the actors be embedded in a single structure - is satisfied. The ties in the information exchange network, though, are very likely to be reciprocal (at least insofar as they can be, given the limitations of the density). There are a number of nodes that receive information from multiple others, so the network is not "efficient". The least upper bound measure (the extent to which all actors have a boss in common) reports a value of 1.25, which would appear to be out of range and, frankly, is a puzzle.


Hierarchy Measure for Complex Networks

Nature, technology and society are full of complexity arising from the intricate web of the interactions among the units of the related systems (e.g., proteins, computers, people). Consequently, one of the most successful recent approaches to capturing the fundamental features of the structure and dynamics of complex systems has been the investigation of the networks associated with the above units (nodes) together with their relations (edges).

Most complex systems have an inherently hierarchical organization and, correspondingly, the networks behind them also exhibit hierarchical features. Indeed, several papers have been devoted to describing this essential aspect of networks, however, without resulting in a widely accepted, converging concept concerning the quantitative characterization of the level of their hierarchy.

Here we develop an approach and propose a quantity (measure) which is simple enough to be widely applicable, reveals a number of universal features of the organization of real-world networks and, as we demonstrate, is capable of capturing the essential features of the structure and the degree of hierarchy in a complex network. The measure we introduce is based on a generalization of the m-reach centrality, which we first extend to directed/partially directed graphs. Then, we define the global reaching centrality (GRC), which is the difference between the maximum and the average value of the generalized reach centralities over the network.
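For the unweighted directed case, the GRC can be sketched in a few lines of Python (our illustrative construction, not the authors' code; the paper also treats weighted and partially directed variants):

    import networkx as nx

    def grc(G: nx.DiGraph) -> float:
        """Global reaching centrality: average gap between the maximum
        local reaching centrality and each node's own value."""
        n = G.number_of_nodes()
        # local reaching centrality: share of other nodes reachable from v
        c = [len(nx.descendants(G, v)) / (n - 1) for v in G]
        return sum(max(c) - ci for ci in c) / (n - 1)

    # An out-star (pure hierarchy) yields GRC = 1; a directed cycle yields 0.
    star = nx.DiGraph([("root", leaf) for leaf in "abcd"])
    print(grc(star))  # 1.0

Recent versions of networkx also ship a comparable built-in, global_reaching_centrality.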

We investigate the behavior of the GRC considering both a synthetic model with an adjustable level of hierarchy and real networks. Results for real networks show that our hierarchy measure is related to the controllability of the given system. We also propose a visualization procedure for large complex networks that can be used to obtain an overall qualitative picture about the nature of their hierarchical structure.

Citation: Mones E, Vicsek L, Vicsek T (2012) Hierarchy Measure for Complex Networks. PLoS ONE 7(3): e33799. https://doi.org/10.1371/journal.pone.0033799

Editor: Stefano Boccaletti, Technical University of Madrid, Italy

Received: January 2, 2012 Accepted: February 17, 2012 Published: March 28, 2012

Copyright: © 2012 Mones et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: This work was supported by the EU FP7 COLLMOT Grant No: 227878. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.


Computation Base Ten

  • Add within 10 using objects or drawings.
  • Add within 5 (no objects or drawings).
  • Subtract within 10 using objects or drawings.
  • Add two 1-digit numbers, sum > 10.
  • Add a 2-digit number and a 1-digit number, sum within 100, no regrouping.
  • Add a 2-digit number and a multiple of ten, within 100, no regrouping.
  • Add two 2-digit numbers, sum within 100, no regrouping.
  • Subtract a multiple of ten from a multiple of ten, up to 90.
  • Subtract a 1-digit number from a 2-digit number, no regrouping.
  • Subtract a multiple of ten from a 2-digit number, within 100, no regrouping.
  • Subtract within 100, no regrouping.
  • Add up to four 2-digit numbers, with regrouping.
  • Match objects in rectangular arrays to an expression showing repeated addition of equal addends.
  • Multiply 1-digit by 1-digit.
  • Multiply 1-digit by a multiple of 10 (10-90).
  • Add multi-digit whole numbers > 1000.
  • Subtract multi-digit whole numbers > 1000.
  • Multiply a whole number of up to 4 digits by a 1-digit whole number.
  • Find whole-number quotients with up to four-digit dividends and one-digit divisors, no remainders.
  • Find whole-number quotients and remainders with up to four-digit dividends and one-digit divisors.
  • Multiply a 2-digit whole number by a 2-digit whole number.
  • Locate decimals to hundredths on a number line.
  • Use decimal notation for fractions with denominators 10 or 100.
  • Multiply multi-digit whole numbers (3 or more digits by 2 or more digits).
  • Find whole-number quotients of whole numbers with up to 4-digit dividends and 2-digit divisors.
  • Evaluate numerical expressions that use grouping symbols.
  • Add decimals to hundredths.
  • Subtract decimals to hundredths.
  • Multiply decimals to hundredths.
  • Divide decimals to hundredths.
  • Find a percent of a quantity as a rate per 100.
  • Divide multi-digit numbers.
  • Subtract multi-digit decimals.
  • Multiply multi-digit decimals.
  • Divide multi-digit decimals.
  • Determine the whole when given a part and percent, in a context.
  • Evaluate numerical expressions involving whole-number bases with whole-number exponents.
  • Calculate the square of a decimal.
  • Calculate the cube of a decimal.
  • Rewrite the sum of two whole numbers by using the distributive property to factor out the greatest common factor.
  • Add two integers (both negative, or one positive and one negative).
  • Add and subtract three or more integers with various signs.
  • Multiply two integers (both negative, or one positive and one negative).
  • Divide two integers (both negative, or one positive and one negative).
  • Identify the prime factorization of a given whole number.
  • Find square roots of small perfect squares of whole numbers.
  • Find cube roots of small perfect cubes of whole numbers.
  • Find square roots of small perfect squares of positive rational numbers in decimal form.
  • Find cube roots of small perfect cubes of positive rational numbers in decimal form.
  • Evaluate complex numerical expressions involving signed numbers, decimals, integer exponents, etc.


Abstract

Social network analysis is an increasingly popular sociological method used to describe and understand the social aspects of communication patterns in the health care sector. The networks studied in this area are special because they are small, and for these sizes, the metrics calculated during analysis are sensitive to the number of people in the network and the density of observed communication. Validation is of particular value in controlling for these factors and in assisting in the accurate interpretation of network findings, yet such approaches are rarely applied. Our aim in this paper was to bring together published case studies to demonstrate how a proposed validation technique provides a basis for standardised comparison of networks within and across studies. A validation is performed for three network studies comprising ten networks, where the results are compared within and across the studies in relation to a standard baseline. The results confirm that hierarchy, centralisation and clustering metrics are highly sensitive to changes in size or density. Amongst the three case studies, we found support for some conclusions and contrary evidence for others. This validation approach is a tool for identifying additional features and verifying the conclusions reached in observational studies of small networks. We provide a methodological basis from which to perform intra-study and inter-study comparisons, for the purpose of introducing greater rigour to the use of social network analysis in health care applications.

Highlights

  • Social network analysis of small networks is problematic when comparing networks of different sizes or densities.
  • Network density and size confound aggregate network metrics, which leads to misinterpretation in published case studies.
  • Comparison across networks of different size or density is possible using a simulated random-network baseline.
  • The method is a way of comparing the strength of influence that social constraints have on patterns of communication.
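The baseline idea in the highlights can be sketched briefly: compare an observed metric against its distribution over random graphs of matched size and density. The sketch below is our own illustration (not the authors' code), using networkx and a clustering metric as the example statistic:

    import networkx as nx

    def baseline(n, p, metric, trials=200):
        """Distribution of `metric` over G(n, p) random graphs."""
        return [metric(nx.gnp_random_graph(n, p)) for _ in range(trials)]

    observed = nx.karate_club_graph()  # stand-in for an observed network
    n, m = observed.number_of_nodes(), observed.number_of_edges()
    p = 2 * m / (n * (n - 1))          # match the observed density
    base = baseline(n, p, nx.average_clustering)
    mu = sum(base) / len(base)
    sd = (sum((b - mu) ** 2 for b in base) / len(base)) ** 0.5
    print((nx.average_clustering(observed) - mu) / sd)  # crude z-score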


Types of Graphs and Charts And Their Uses

If you are wondering what the different types of graphs and charts are, along with their uses and names, this page summarizes them with examples and pictures.

Because the different kinds of graphs aim to represent data, they are used in many areas, such as statistics, data science, math, economics, and business.


Every type of graph is a visual representation of data on a diagram plot (e.g., a bar, pie, or line chart) that shows trends and relationships between variables.

Although it is hard to enumerate every type of graph, this page covers all of the common types of statistical graphs and charts (and their meanings) that are widely used in any science.

1. Line Graphs

A line chart graphically displays data that changes continuously over time. Each line graph consists of points that connect data to show a trend (continuous change). Line graphs have an x-axis and a y-axis. In most cases, time is plotted on the horizontal axis.

  • When you want to show trends. For example, how house prices have increased over time.
  • When you want to make predictions based on a data history over time.
  • When comparing two or more different variables, situations, and information over a given period of time.

The following line graph shows annual sales of a particular business company for the period of six consecutive years:

Note: the above example has 1 line. However, one line chart can compare multiple trends by displaying several lines.
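Here is a minimal matplotlib sketch of such a line chart; the sales figures are made up for illustration:

    import matplotlib.pyplot as plt

    years = [2015, 2016, 2017, 2018, 2019, 2020]
    sales = [120, 135, 160, 155, 180, 210]  # hypothetical annual sales

    plt.plot(years, sales, marker="o")
    plt.xlabel("Year")
    plt.ylabel("Sales (thousands)")
    plt.title("Annual sales over six consecutive years")
    plt.show()

Adding a second plt.plot() call with another data series would produce the multi-line comparison mentioned in the note above.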

2. Bar Charts

Bar charts represent categorical data with rectangular bars (to understand what is categorical data see categorical data examples). Bar graphs are among the most popular types of graphs and charts in economics, statistics, marketing, and visualization in digital customer experience. They are commonly used to compare several categories of data.

Each rectangular bar has a length and height proportional to the value that it represents.

One axis of the bar chart presents the categories being compared. The other axis shows a measured value.

  • When you want to display data that are grouped into nominal or ordinal categories (see nominal vs ordinal data).
  • To compare data among different categories.
  • Bar charts can also show large data changes over time.
  • Bar charts are ideal for visualizing the distribution of data when we have more than three categories.

The bar chart below represents the total sum of sales for Product A and Product B over three years.

The bars come in 2 types: vertical and horizontal. It doesn’t matter which kind you use. The above one is a vertical type.

3. Pie Charts

When it comes to statistical types of graphs and charts, the pie chart (or the circle chart) has a crucial place and meaning. It displays data and statistics in an easy-to-understand ‘pie-slice’ format and illustrates numerical proportion.

Each pie slice is relative to the size of a particular category in a given group as a whole. To say it another way, the pie chart breaks down a group into smaller pieces. It shows part-whole relationships.

To make a pie chart, you need a list of categorical variables and numerical variables.

  • When you want to create and represent the composition of something.
  • It is very useful for displaying nominal or ordinal categories of data.
  • To show percentage or proportional data.
  • When comparing areas of growth within a business such as profit.
  • Pie charts work best for displaying data for 3 to 7 categories.

The pie chart below represents the proportion of types of transportation used by 1000 students to go to their school.

Pie charts are widely used by data-driven marketers for displaying marketing data.

4. Histogram

A histogram shows continuous data in ordered rectangular columns (to understand what is continuous data see our post discrete vs continuous data). Usually, there are no gaps between the columns.

The histogram displays a frequency distribution (shape) of a data set. At first glance, histograms look similar to bar graphs. However, there is a key difference between them: bar charts represent categorical data, while histograms represent continuous data.

  • When the data is continuous.
  • When you want to represent the shape of the data’s distribution.
  • When you want to see whether the outputs of two or more processes are different.
  • To summarize large data sets graphically.
  • To communicate the data distribution quickly to others.

The histogram below represents per capita income for five age groups.

Histograms are very widely used in statistics, business, and economics.

5. Scatter plot

The scatter plot is an X-Y diagram that shows a relationship between two variables. It is used to plot data points on a vertical and a horizontal axis. The purpose is to show how much one variable affects another.

Usually, when there is a relationship between 2 variables, the first one is called independent. The second variable is called dependent because its values depend on the first variable.

Scatter plots also help you predict the behavior of one variable (dependent) based on the measure of the other variable (independent).

  • When trying to find out whether there is a relationship between 2 variables.
  • To predict the behavior of dependent variable based on the measure of the independent variable.
  • When having paired numerical data.
  • When working with root cause analysis tools to identify the potential for problems.
  • When you just want to visualize the correlation between 2 large datasets without regard to time.

The below Scatter plot presents data for 7 online stores, their monthly e-commerce sales, and online advertising costs for the last year.

The orange line you see in the plot is called “line of best fit” or a “trend line”. This line is used to help us make predictions that are based on past data.

The Scatter plots are used widely in data science and statistics. They are a great tool for visualizing linear regression models.

You can see more examples and explanations of scatter plots in our posts what does a scatter plot show and simple linear regression examples.

6. Venn Chart

A Venn diagram (also called a primary diagram, set diagram, or logic diagram) uses overlapping circles to visualize the logical relationships between two or more groups of items.

Venn Diagram is one of the types of graphs and charts used in scientific and engineering presentations, in computer applications, in maths, and in statistics.

The basic structure of the Venn diagram is usually overlapping circles. The items in the overlapping section have specific common characteristics. Items in the outer portions of the circles do not have common traits.

  • When you want to compare and contrast groups of things.
  • To categorize or group items.
  • To illustrate logical relationships from various datasets.
  • To identify all the possible relationships between collections of datasets.

The following science example of Venn diagram compares the features of birds and bats.

7. Area Charts

Area charts show the change in one or several quantities over time. They are very similar to the line chart. However, the area between the axis and the line is usually filled with color.

Although line and area charts support the same type of analysis, they cannot always be used interchangeably. Line charts are often used to represent multiple data sets. Area charts cannot show multiple data sets clearly because area charts show a filled area below the line.

  • When you want to show trends, rather than express specific values.
  • To show a simple comparison of the trend of data sets over the period of time.
  • To display the magnitude of a change.
  • To compare a small number of categories.

The area chart has 2 variants: a variant with data plots overlapping each other and a variant with data plots stacked on top of each other (known as a stacked area chart, as shown in the following example).

The area chart below shows quarterly sales for product categories A and B for the last year.

This area chart shows you a quick comparison of the trend in the quarterly sales of Product A and Product B over the period of the last year.

8. Spline Chart

The spline chart is one of the most widespread types of graphs and charts used in statistics. It is a form of the line chart that represents smooth curves through the different data points.

Spline charts possess all the characteristics of a line chart except that spline charts have a fitted curved line to join the data points. In comparison, line charts connect data points with straight lines.

  • When you want to plot data that requires the usage of curve-fitting such as a product lifecycle chart or an impulse-response chart.
  • Spline charts are often used in designing Pareto charts.
  • Spline charts are also often used for data modeling when you have a limited number of data points and need to estimate the intervening values.

The following spline chart example shows sales of a company through several months of a year:

9. Box and Whisker Chart

A box and whisker chart is a statistical graph for displaying sets of numerical data through their quartiles. It displays a frequency distribution of the data.

The box and whisker chart helps you display the spread and skewness of a given set of data using the five-number summary: minimum, maximum, median, and the lower and upper quartiles. The five-number summary provides a statistical description of a particular set of numbers. It shows you the range (minimum and maximum numbers), the spread (upper and lower quartiles), and the center (median) of the set of data numbers.

A very simple figure of a box and whisker plot you can see below:

Box and Whisker Chart Uses:

  • When you want to observe the upper, lower quartiles, mean, median, deviations, etc. for a large set of data.
  • When you want to see a quick view of the dataset distribution.
  • When you have multiple data sets that come from independent sources and relate to each other in some way.
  • When you need to compare data from different categories.

The table and box-and-whisker plots below show test scores for Maths and Literature for the same class.

Maths: 35, 77, 92, 43, 55, 66, 73, 70
Literature: 35, 43, 40, 43, 50, 60, 70, 92
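As a quick illustration, the five-number summary behind a box plot can be computed for the Maths scores above with a short Python sketch (numpy's percentile handles the quartiles):

    import numpy as np

    maths = np.array([35, 77, 92, 43, 55, 66, 73, 70])
    summary = {
        "min": maths.min(),
        "Q1": np.percentile(maths, 25),
        "median": np.median(maths),
        "Q3": np.percentile(maths, 75),
        "max": maths.max(),
    }
    print(summary)  # the box spans Q1..Q3; whiskers reach min and max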

Box and whisker charts have applications in many scientific areas and types of analysis, such as statistical analysis, test results analysis, marketing analysis, and data analysis.

10. Bubble Chart

Bubble charts are super useful types of graphs for making a comparison of the relationships between data in 3 numeric-data dimensions: the Y-axis data, the X-axis data, and data depicting the bubble size.

Bubble charts are very similar to XY Scatter plots but the bubble chart adds more functionality – a third dimension of data that can be extremely valuable.

Both axes (X and Y) of a bubble chart are numeric.

  • When you have to display three or four dimensions of data.
  • When you want to compare and display the relationships between categorized circles, by the use of proportions.

The bubble chart below shows the relationship between Cost (X-Axis), Profit (Y-Axis), and Probability of Success (%) (Bubble Size).

11. Pictographs

The pictograph or a pictogram is one of the more visually appealing types of graphs and charts that display numerical information with the use of icons or picture symbols to represent data sets.

They are a very easy-to-read way of visualizing statistical data. A pictogram shows the frequency of data as images or symbols. Each image/symbol may represent one or more units of a given dataset.

  • When your audience prefers and understands better displays that include icons and illustrations. Fun can promote learning.
  • It’s common for infographics to use pictograms.
  • When you want to compare two points in an emotionally powerful way.

The following pictograph represents the number of computers sold by a business company for the period from January to March.

The pictograph example above shows that 20 computers were sold in January (4 symbols × 5 = 20), 30 in February (6 symbols × 5 = 30), and 15 in March (3 symbols × 5 = 15).

12. Dot Plot

A dot plot or dot graph is just one of the many types of graphs and charts used to organize statistical data. It uses dots to represent data. A dot plot is used for relatively small sets of data whose values fall into a number of discrete categories.

If a value appears more than once, the dots are stacked one above the other. That way, the column height of dots shows the frequency for that value.

  • To plot frequency counts when you have a small number of categories.
  • Dot plots are very useful when the variable is quantitative or categorical.
  • Dot graphs are also used for univariate data (data with only one variable that you can measure).

Suppose you have a class of 26 students. They are asked to tell their favorite color. The dot plot below represents their choices:

It is obvious that blue is the most preferred color by the students in this class.

13. Radar Chart

A radar chart is one of the most modern types of graphs and charts – ideal for multiple comparisons. Radar charts use a circular display with several different quantitative axes looking like spokes on a wheel. Each axis shows a quantity for a different categorical value.

Radar charts are also known as spider charts, web charts, star plots, irregular polygons, polar charts, cobweb charts or Kiviat diagram.

The radar chart has many applications nowadays in statistics, maths, business, sports analysis, and data intelligence.

  • When you want to observe which variables have similar values or whether there are any outliers amongst each variable.
  • To represent multiple comparisons.
  • When you want to see which variables are scoring low or high within a dataset. This makes radar chart ideal for displaying performance.

For example, we can compare employees’ performance on a scale of 1–8 on subjects such as Punctuality, Problem-solving, Meeting Deadlines, Marketing Knowledge, and Communications. A point that is closer to the center on an axis shows a lower value and a worse performance.

Label: Punctuality, Problem-solving, Meeting Deadlines, Marketing Knowledge, Communications
Jane: 6, 5, 8, 7, 8
Samanta: 7, 5, 5, 4, 8

It is obvious that Jane has a better performance than Samanta.
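A minimal matplotlib sketch of this radar chart follows, using the scores from the table above (the plotting choices are ours, not a prescribed recipe):

    import numpy as np
    import matplotlib.pyplot as plt

    labels = ["Punctuality", "Problem-solving", "Meeting Deadlines",
              "Marketing Knowledge", "Communications"]
    scores = {"Jane": [6, 5, 8, 7, 8], "Samanta": [7, 5, 5, 4, 8]}

    angles = np.linspace(0, 2 * np.pi, len(labels), endpoint=False).tolist()
    ax = plt.subplot(polar=True)
    for name, vals in scores.items():
        # close each polygon by repeating the first point
        ax.plot(angles + angles[:1], vals + vals[:1], label=name)
    ax.set_xticks(angles)
    ax.set_xticklabels(labels)
    ax.legend()
    plt.show()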

14. Pyramid Graph

When it comes to easy to understand and good looking types of graphs and charts, pyramid graph has a top place.

A pyramid graph is a chart in a pyramid shape or triangle shape. These types of charts are best for data that is organized in some kind of hierarchy. The levels show a progressive order.

  • When you want to indicate a hierarchy level among the topics or other types of data.
  • Pyramid graphs are often used to represent progressive orders such as “older to newer”, “more important to least important”, and “specific to least specific”.
  • When you have a proportional or interconnected relationship between data sets.

A classic pyramid graph example is the healthy food pyramid that shows fats, oils, and sugar (at the top) should be eaten less than many other foods such as vegetables and fruits (at the bottom of the pyramid).

You might know that choosing the right type of chart is some kind of tricky business.

Practically, the choice depends on 2 major things: the kind of analysis you want to perform and the type of data you have.


Commonly, when we aim to facilitate a comparison, we use a bar chart or a radar chart. When we want to show trends over time, we use a line chart or an area chart, and so on.

Anyway, you have a wide choice of types of graphs and charts. Used in the right way, they are a powerful weapon to help you make your reports and presentations both professional and clear.

What are your favorite types of graphs and charts? Share your thoughts in the field below.

About The Author

Silvia Valcheva

Silvia Valcheva is a digital marketer with over a decade of experience creating content for the tech industry. She has a strong passion for writing about emerging software and technologies such as big data, AI (Artificial Intelligence), IoT (Internet of Things), process automation, etc.


Computational Complexity: A Quantitative Perspective

5.7 Hard predicates

IN BRIEF: Given a hard function, one can effectively construct a hard predicate. The type of hardness (crypto hardness or exponential hardness) is preserved.

Predicate functions are particularly interesting because they model decision problems, i.e., languages, which are objects of major interest in computational complexity. Also, the construction of the type II pseudo-random generators that will be presented in the next section utilizes hard predicates as the starting point. Consequently, we focus in this section on hard predicates. Recall that a predicate is a function whose outputs can only be 0 or 1. Clearly, any predicate $f$ can be calculated correctly on a fraction $\frac{1}{2}$ of the inputs at each length by quite simple circuits. Indeed, either the circuit that outputs 1 on all inputs, or the circuit that outputs 0 on all inputs, will agree with $f$ on at least half of the inputs. Therefore, when considering the performance of a circuit that attempts to calculate a predicate, only the bias away from $\frac{1}{2}$ is relevant. Accordingly, we give the following definitions for the general form of a hard predicate as well as for two particular forms of strong hardness for predicates.

($(\epsilon, S)$-hard predicate) A predicate $f: \Sigma^\ell \to \{0,1\}$ is $(\epsilon, S)$-hard if for every circuit $C$ of size $S$, $\mathrm{Prob}_{x \in \Sigma^\ell}(C(x) = f(x)) < \frac{1}{2} + \epsilon$.

(Crypto-hard predicate) A predicate $f: \Sigma^* \to \{0,1\}$ is crypto-hard if there is a superpolynomial function $S$ so that the following holds: for any polynomial $p$ and for any family of circuits $(C_\ell)_{\ell \in \mathbb{N}}$ of size at most $S(\ell)$, $\mathrm{Prob}_{x \in \Sigma^\ell}(C_\ell(x) = f(x)) < \frac{1}{2} + \frac{1}{p(\ell)}$ for all sufficiently large $\ell$.

(Exponentially-hard predicate) A predicate $f: \Sigma^* \to \{0,1\}$ is exponentially-hard if there is a constant $c$ so that for any family of circuits $(C_\ell)_{\ell \in \mathbb{N}}$ with $\mathrm{size}(C_\ell) \le 2^{c\ell}$, $\mathrm{Prob}_{x \in \Sigma^\ell}(C_\ell(x) = f(x)) < \frac{1}{2} + 2^{-c\ell}$ for all $\ell$.

The next result shows that a length-preserving crypto-hard (respectively, exponentially hard) function can be converted into a crypto-hard (respectively, exponentially hard) predicate.

(a) Let $f: \Sigma^* \to \Sigma^*$ be a length-preserving function which is crypto-hard. Then there exists a crypto-hard predicate $f': \Sigma^* \to \{0,1\}$. Moreover, $f'$ can be constructed effectively from $f$ in polynomial time.

(b) Let $f: \Sigma^* \to \Sigma^*$ be a length-preserving function such that for some positive constant $c$ and for all sufficiently large $\ell$, $f$ is $(2^{-c\ell}, 2^{c\ell})$-hard. Then there exists an exponentially-hard predicate $f': \Sigma^* \to \{0,1\}$. Moreover, $f'$ can be constructed effectively from $f$ in polynomial time.

Proof. We prove (a). Let $p(\ell)$ be an arbitrary polynomial and let $s(\ell)$ be a superpolynomial function so that for any circuit $C$ of size $s(\ell)$, for $\ell$ sufficiently large, $\mathrm{Prob}_{x \in \Sigma^\ell}(C(x) = f(x)) < \frac{1}{p(\ell)}$.

We will be using once again error-correcting codes. This time we will utilize the Hadamard error-correcting code and we will take advantage of its list-decoding property (see Theorem 5.3.5). Namely, for $f(x)$ of length $\ell$, we consider $Had(f(x))$ and define the predicate on inputs $x$ and $r$ to be the $r$-th bit of $Had(f(x))$. If a circuit can calculate correctly at least a fraction $\frac{1}{2} + \frac{1}{p(\ell)}$ of the bits of $Had(f(x))$, then, by Theorem 5.3.5, we can produce a short list which contains $f(x)$. By picking randomly one element from this list, we have a fairly good chance of retrieving $f(x)$. This contradicts the hardness of $f$.
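As a toy illustration of this construction (the bit-vector encoding and names are ours; the formal objects in the proof are circuits, not programs), the predicate returns the $r$-th bit of the Hadamard codeword of $f(x)$, i.e., the inner product of $f(x)$ and $r$ modulo 2:

    def hadamard_bit(y, r):
        """The r-th bit of Had(y): the inner product <y, r> mod 2."""
        return sum(a & b for a, b in zip(y, r)) % 2

    y = [1, 0, 1]  # stands in for f(x), with l = 3
    codeword = [hadamard_bit(y, [(i >> k) & 1 for k in range(3)])
                for i in range(2 ** 3)]  # enumerate all r in {0,1}^3
    print(codeword)  # the 2^l bits of Had(f(x)): [0, 1, 0, 1, 1, 0, 1, 0]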

We now proceed with the formal proof. Let $f'(x, r)$ be the $r$-th bit of $Had(f(x))$, that is, $f'(x, r) = f(x) \cdot r$ for $x, r \in \Sigma^\ell$, where $\cdot$ denotes the inner product modulo 2.

Clearly, given oracle access to $f$, $f'$ can be calculated in polynomial time. Assume that there is a circuit $C$ of size $s_1(\ell)$ such that $\mathrm{Prob}_{x, r \in \Sigma^\ell}(C(x, r) = f'(x, r)) \ge \frac{1}{2} + \frac{1}{p(\ell)}$.

Then a small variation of the above relation holds for a polynomial fraction of fixed $x$. Indeed, let $B = \{ x \in \Sigma^\ell \mid \mathrm{Prob}_r(C(x, r) = f'(x, r)) \ge \frac{1}{2} + \frac{1}{2p(\ell)} \}$. By a standard averaging argument, the fraction of $x$ in $B$ is at least $\frac{1}{2p(\ell)}$.

Therefore, for any $x \in B$, $C(x, r) = f(x) \cdot r$ for at least a fraction $\frac{1}{2} + \frac{1}{2p(\ell)}$ of the $r$ in $\Sigma^\ell$. In other words, if $\bar{C}$ denotes the string $C(x, 0\ldots 0) \ldots C(x, 1\ldots 1)$ of length $2^\ell$, then the relative Hamming distance between $\bar{C}$ and $Had(f(x))$ is at most $\frac{1}{2} - \frac{1}{2p(\ell)}$. By Theorem 5.3.5, there is an oracle probabilistic circuit $A'$ of size $O(\ell^3 \cdot (2p(\ell))^4)$ that makes $\ell^2 \cdot (2p(\ell))^2$ queries to $C$ (formally, to the oracle string $\bar{C}$) with the following property: for every $x \in B$, with probability at least $\frac{3}{4}$, $A'$ outputs a list of $\ell \cdot (2p(\ell))^2 + 1$ strings which includes $f(x)$. By embedding the circuit $C$ into $A'$, we get a probabilistic circuit $A$ of size bounded by $O(\ell^3 \cdot (p(\ell))^4) \cdot \mathrm{size}(C)$ that, for all $x \in B$, outputs a list as above. We further modify the circuit $A$ so that at the end it randomly picks one element from this list. With probability at least $\frac{1}{4\ell \cdot (p(\ell))^2 + 1}$, this element is $f(x)$.

Thus, the probability that the modified $A$ on input $x$ computes $f(x)$ is at least $\mathrm{Prob}_{x, rand}(A(x) = f(x)) \ge \frac{1}{2p(\ell)} \cdot \frac{3}{4} \cdot \frac{1}{4\ell \cdot (p(\ell))^2 + 1}$, which is larger than the inverse of a polynomial in $\ell$.

(Here, $rand$ denotes the random bits used by the circuit $A$.) The size of $A$ is bounded by $s(\ell)$ (for an appropriate choice of the constant $c$ in Equation (5.15)). Since the polynomial $p(\ell)$ is arbitrary and $s_1(\ell)$ is superpolynomial, it follows that $f'$ is a crypto-hard predicate.

The proof of (b) is virtually identical.


ABSTRACT

There are many different types of species distribution models (SDMs) that are widely used in the field of ecology. In this research, we explored a new advanced mechanism for predicting the distribution of species based on fuzzy membership functions, the principle of maximum entropy, fuzzy mathematics comprehensive evaluation, and the framework of Bayesian networks. We use a fuzzy mathematics and Bayesian network model (FBM) to simulate relationships between species’ habitats and environmental variables, relationships that may be difficult to quantify effectively. FBM, which combines species data, environmental data, expert experience, and machine learning, can reduce data and system error. In the case of the medicinal plant Angelica sinensis (Oliv.) Diels, many approaches were applied, including nine learning sequences of sampling sites, three FBM models, two types of information classification by fuzzy mathematical classification (FMC) and equal interval classification (EIC), and evaluation by AIC and log-likelihood. Through a comparison of reasoning results between FBM and a fuzzy matter element model (FME) at testing sites, the results show that the combination of objective data and an empirical model structure gives FBM better outputs. In addition, FBM sensitivity analysis helps researchers explore in detail the impact of environmental factors on each level of species habitat suitability. The temperature factor has an important influence on the highly suitable, moderately suitable, and lowly suitable habitats of A. sinensis. Through FMC and sensitivity analysis, an annual mean temperature (Bio1) of 5.92 °C–9.05 °C and a mean temperature of warmest quarter (Bio10) of 14.80 °C–18.60 °C are the highly suitable habitat temperature ranges of A. sinensis.


Data Structure and Algorithms - Quick Sort

Quick sort is a highly efficient sorting algorithm based on partitioning an array of data into smaller arrays. A large array is partitioned into two arrays: one holds values smaller than a specified value (called the pivot), on which the partition is made, and the other holds values greater than the pivot.

Quicksort partitions an array and then calls itself recursively twice to sort the two resulting subarrays. The algorithm is quite efficient for large data sets, as its average-case complexity is O(n log n); its worst-case complexity, however, is O(n²).
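A minimal Python sketch of the scheme just described (written for clarity rather than as an in-place, tuned implementation; the pivot choice is arbitrary):

    def quicksort(arr):
        if len(arr) <= 1:
            return arr
        pivot = arr[len(arr) // 2]  # pick a pivot value
        smaller = [x for x in arr if x < pivot]
        equal = [x for x in arr if x == pivot]
        larger = [x for x in arr if x > pivot]
        # sort the two subarrays recursively, then concatenate
        return quicksort(smaller) + equal + quicksort(larger)

    print(quicksort([9, 2, 7, 4, 1, 8]))  # [1, 2, 4, 7, 8, 9]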


A Learning Progression for Variability

While the authors we discussed developed their own descriptors of the levels many students pass through as they gain an understanding of variability, there is a good deal of commonality among these ideas. And while many of these authors are based in Australia and conducted their research with Australian students, the mathematics standards in the Australian Curriculum (Assessment and Reporting Authority, 2010) are sufficiently similar to the CCSSM that we can combine their research with the research of the US-based experts to develop our own LP for variability; see Table 2. This hypothetical LP has five levels; the first four are based on the work of Shaughnessy et al. (1999), Torok and Watson (2000), Watson et al. (2003), Reading and Shaughnessy (2000, 2004), and others. For three of these levels, we have taken the names given by Watson et al.; for Level 1, we thought “Naive Understanding of Variability” was a more descriptive title than Watson et al.'s “Prerequisites for Variation.” Our Level 5 incorporates the work of Peters (2011) on robust understanding of variability. The full LP is presented in Table 2; an overview of the five levels follows:

Robust understanding of variability

University students and in-service teachers

  • Acknowledge the existence of variability and the need for study design.
  • Anticipate reasonable variability in data.
  • Anticipate and allow for reasonable variability in data when using models.
  • Use context to consider sources and types of variability to inform and critique study design.
  • Describe and measure variability in data for contextual variables as part of exploratory data analysis.
  • Identify the pattern of variability in data or the expected pattern of variability for contextual variables.
  • Control variability when designing studies or critique the extent to which variability was controlled in studies.
  • Explore controlled and random variability to infer relationships among data and variables.
  • Model controlled or random variability in data, transformed data, or simple statistics.
  • Anticipate the effects of sample size when designing a study or critiquing a study design.
  • Examine the effects of sample size through the creation, use, or interpretation of data-based graphical or numerical representations.
  • Anticipate the effects of sample size on the variability of a sampling distribution.
  • The distribution of scores assigned by each graduate student
  • The standard deviation of the scores assigned by each graduate student
  • The range of scores assigned by each graduate student
  • Whether or not the papers were assigned at random to the graduate students for grading
  • The number of scores at each score point assigned by each graduate student

Critical aspects of variability

  • Students at this level exhibit proficiency in drafting arguments based on proportional reasoning. For example, when asked to predict the number of red candies in a sample drawn from a known population, they would focus their attention on the proportion of red candies instead of the number as a result, they would be likely to make appropriate predictions about both the center and the spread of a distribution.
  • Students will demonstrate in their responses an appropriate degree of variation with an appropriate balance between variation and clustering.
  • Students understand the importance of variability as well as the importance of central tendency in characterizing a data set.
  • Students are able to give sophisticated definitions of technical terms such as sample, random, and variation (Watson et al., 2003 ).
  • Students can give a statistically appropriate analysis of graphs; they can generate graphical displays of data and use them to compare two or more distributions (Ben-Zvi, 2004; Watson et al., 2003).
  • Students are likely to detect sources of bias in samples, such as nonrepresentativeness.
  • Students can describe a distribution in terms of deviations from a central value (mean, median, or mode; Reading, 2004; Reading & Shaughnessy, 2000, 2004; Watson et al., 2003).
  • Students have a conceptual understanding of standard deviation as a measure of variability.
  • Students can translate between graphical, numeric, and symbolic representations of distributions.

The following histograms represent the age distributions of the two countries. (b) Adapted from New York State Education Department (2013). CC BY-NC-SA.

  1. How do the shapes of the two histograms differ?
  2. Approximately what percentage of people in Kenya in 2010 were between the ages of 0 and 10 years?
  3. Approximately what percentage of people in the United States in 2010 were between the ages of 0 and 10 years?
  4. Approximately what percentage of people in Kenya in 2010 were between the ages of 70 and 100 years?
  5. Approximately what percentage of people in the United States in 2010 were between 70 and 100 years?
  6. The population of Kenya in 2010 was approximately 41 million people. Approximately how many people in Kenya were between the ages of 0 and 10 years? Between 60 and 100 years?
  7. If you had visited a city in Kenya in 2010, do you think you would likely have seen many teenagers? Would you likely have seen many people over 70 years old? Explain your answers based on the histogram. (b) Adapted from New York State Education Department (2013). CC BY-NC-SA.

Applications of variability

  • Students at this level will have some facility with proportional reasoning but will sometimes produce responses with too much or too little variation; the range may be insufficiently clustered around the mean or may be too tightly clustered around the mean (Torok & Watson, 2000; Watson et al., 2003).
  • Students at this level can calculate the measures of central tendency (mean, median, and mode) and spread (range and mean absolute deviation) but may not appreciate the importance of variation. Their understanding of these measures is procedural and not conceptual; they do not understand what the measures mean or what the distinction is between measures of center and measures of spread (Ben-Zvi, 2004; delMas & Liu, 2005; Watson et al., 2003).
  • Students at this level may attempt to define technical terms like sample, random, and variation, but they may rely on examples to explain the meaning of the terms (Watson et al., 2003 ).
  • Students may be able to provide a partial analysis of graphs while missing overall trends (Watson et al., 2003 ).
  • When selecting samples, students may focus on representativeness or randomness, but not both.
  • A distribution may be described in terms of deviations from an anchor value that is not a central value (Reading, 2004; Reading & Shaughnessy, 2000, 2004; Watson et al., 2003).
  • Students may not understand what an outlier is, thinking that an outlier is the least frequent value.

Below are the heights of the players on the University of Maryland women's basketball team for the 2012–2013 season and the heights of the players on the women's field hockey team for the 2012 season.

  1. Based on visual inspection of the data, which group appears to have the larger average height? Which group appears to have the greater variability in the heights?
  2. Compute the mean and mean absolute deviation (MAD) for each group. Do these values support your answers in part (a)?
  3. How many of the 12 basketball players are shorter than the tallest field hockey player?
  4. Imagine that an athlete from one of the two teams told you she needs to go to practice. You estimate that she is about 65 inches tall. If you had to pick, would you think that she was a field hockey player or that she was a basketball player? Explain your reasoning.
  5. The women on the Maryland field hockey team are not a random sample of all female college field hockey players. Similarly, the women on the Maryland basketball team are not a random sample of all female college basketball players. However, for purposes of this task, suppose that these two groups can be regarded as random samples of all female college field hockey players and all female college basketball players, respectively. If these were random samples, would you think that female college basketball players are typically taller than female college field hockey players? Explain your decision using answers to the previous questions and/or additional analysis.
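Since the rosters themselves are not reproduced here, the following Python sketch shows the mean absolute deviation (MAD) computation asked for in part 2, using a small hypothetical sample of heights:

    def mad(xs):
        """Mean absolute deviation: average distance from the mean."""
        m = sum(xs) / len(xs)
        return sum(abs(x - m) for x in xs) / len(xs)

    heights = [70, 72, 75, 68, 74]  # hypothetical heights, in inches
    print(round(mad(heights), 2))   # 2.24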

Partial recognition of variability

  • Students at this level are likely to have difficulty with proportional reasoning. For example, they may focus on the number of red candies in a population instead of the proportion of red candies. As a result, they may overestimate the number of red candies (Torok & Watson, 2000 ).
  • Students are likely to use unquantified chance statements to describe outcomes, such as “Anything can happen” (Watson et al., 2003 ).
  • Students may interpret the pattern in a distribution but may not make a specific reference to the variation exhibited in the distribution.
  • Terms like sample, random, and variation are likely to be familiar, but students may not understand the precise mathematical meanings of these terms or may have difficulty expressing the mathematically accurate meanings of these terms in words (Watson et al., 2003 ).
  • Students may make statements that informally account for variability. They may acknowledge variation by giving a range of numbers in response to a question instead of a specific value (Torok & Watson, 2000 ).
  • Students will describe features of a distribution numerically. They can support their claims with references to specific features in the data (Reading, 2004 ).
  • At this level, students will have difficulty describing a particular distribution.
  • Students' reasons do not reflect understanding of chance or variation.
  • Students' explanation of variability as displayed in graphs is apt to be flawed (Watson et al., 2003 , p. 23).
  • Students at this level can calculate mean absolute deviation and understand that it measures the variability in a population (NGA/CCSSO, 2010).

1. Suppose we have a bowl with 100 candies in it. 20 are yellow, 50 are red, and 30 are blue. Suppose you pick out 10 candies.

How many reds do you expect to get? ___

Would this happen every time? Why?

2. Altogether six of you do this experiment.

What do you think is likely to occur for the numbers of red candies that are written down?

_____, _____, _____, _____, _____, _____

Why are these likely numbers for the reds?

3. Choose the list that you think best describes what might happen:

  1. 5, 9, 7, 6, 8, 7
  2. 3, 7, 5, 8, 5, 4
  3. 5, 5, 5, 5, 5, 5
  4. 2, 3, 4, 3, 4, 4
  5. 7, 7, 7, 7, 7, 7
  6. 3, 0, 9, 2, 8, 5
  7. 10, 10, 10, 10, 10, 10

Why do you think the list you chose best describes what might happen?

4. Suppose that 6 students did the experiment—pulled out 10 candies from this bowl, wrote down the number of reds, put them back, mixed them up.

What do you think the numbers will most likely go from? From _____ (low) to _____ (high) number of reds.
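A quick Python simulation of this task (a sketch using the random module) shows the kind of variability students should anticipate: draws cluster around the expected 5 reds but vary from sample to sample.

    import random

    bowl = ["red"] * 50 + ["yellow"] * 20 + ["blue"] * 30
    draws = [sum(c == "red" for c in random.sample(bowl, 10))
             for _ in range(6)]  # six students, 10 candies each
    print(draws)  # e.g., [5, 4, 6, 5, 7, 3]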

Naïve understanding of variability

  • Students at this level may reason from factors unrelated to the distribution, they may attempt to justify their responses with stories about personal experiences, or they may focus on irrelevant aspects of the data (Ben-Zvi, 2004; Torok & Watson, 2000; Watson et al., 2003).
  • Students may overpredict variability, perhaps because they think that all possible outcomes are equally likely; they have the idea that “anything can happen” in a chance experiment (Torok & Watson, 2000; Watson et al., 2003).
  • Students recognize variation only in the simple context of “not looking the same every day” or in describing a surprising outcome (Watson et al., 2003 , p. 11).
  • Students at this level will have a poor knowledge of technical terms associated with variation (Torok & Watson, 2000 ).
  • Students may make informal statements that do not take into account the variability in the data.
  • Students may describe features of a distribution in words and not numerically (Reading, 2004 ).
  • Students at this level may confuse different measures of central tendency, such as mean and mode (Torok & Watson, 2000 ).
  • Students may be able to read information from a graph but have difficulty interpreting information obtained from a graph or integrating information obtained from several graphs (Watson et al., 2003 ).
  • Responses to questions involving chance are apt to be numerically inappropriate.
  • Students at this level will not have developed measures of variability.

1. A class used this spinner. Out of 50 spins, how many times do you think the spinner will land on the shaded part? Why do you think this?

2. A class of students recorded the number of years their families had lived in their town. Here are two graphs that students drew to tell their story (Watson et al., 2003 ).

a. What can you tell by looking at graph 1?

b. What can you tell by looking at graph 2?

c. Which graph tells the story better? Why?

  • (a) The descriptors and the task at this level are adapted from Peters (2011), Figure 13. Copyright 2011 by the International Association for Statistical Education.
  • (b) Adapted from New York State Education Department (2013). CC BY-NC-SA.

Level 5: Robust Understanding of Variability.

Level 4: Critical Aspects of Variability.

Level 3: Applications of Variability.

Level 2: Partial Recognition of Variability.

Level 1: Naive Understanding of Variability.

When aligned with the CCSSM, Levels 1–4 correspond to an understanding of variability appropriate for middle or high school students. Research by English and Watson ( 2016 ), however, has suggested that students as early as fourth grade can gain an understanding of variability through the administration of carefully planned tasks, and the GAISE report (Franklin et al., 2007 ) suggests that elementary school students can use the range of a set of data as a measure of its spread. Lehrer and Kim ( 2009 ) found that students in Grades 5 and 6, when engaged in modeling data, can recognize and devise measures of variability. Hence, in the proposed LP, we have aligned Level 1 with grades earlier than Grade 6. Level 2 aligns with Grade 6, Level 3 aligns with Grade 7, and Level 4 aligns with high school. Level 5 corresponds to an understanding that might be approached by an advanced university statistics student or in-service statistics teacher. These alignments, however, are only intended to demonstrate how the levels of our LP correspond with standards in the CCSSM. One must also keep in mind the admonition in the GAISE report. A Grade 6 student who has had no previous instruction in statistics will begin at Level 1 before progressing to Level 2, and high school students will not be able to begin Level 4 until they have mastered Level 3 (Franklin et al., 2007 , p. 13).

Students with a Level 1 understanding in the proposed LP often provide arguments that are based on an idiosyncratic understanding of variability or on their previous experiences, and not on an analysis of the data. This was observed by Torok and Watson (2000), who remarked that students at their Level A often reason from factors unrelated to the distribution; by Watson et al. (2003), who observed that Level 1 students are likely to justify responses with stories about their personal experiences; and by Ben-Zvi (2004), for whom students at his Stage 1 focus on irrelevant aspects of the data.

But not all naive understandings are due to the use of irrelevant information. While Shaughnessy et al. ( 1999 ) did not construct developmental levels, they observed several stages in student understanding (or misunderstanding). In particular, they observed that some students expected a wide range of possible numbers of red lollies in a sample of 10 lollies, perhaps because students with a low level of understanding believe that the various possible outcomes are equally likely, that anything can happen in a chance experiment. This is consistent with Torok and Watson's ( 2000 ) observation that students at their Level A often predict too much variation, and it accounts in part for Watson et al.'s ( 2003 ) observation that students at their Level 1 give responses that are numerically inappropriate. Watson et al. also found that students at their Level 2 may make unquantified statements, such as “anything can happen,” to justify their conclusions.

Watson et al. ( 2003 ) found that students at Level 1 recognize variation only in the context of “not looking the same every day” (p. 11), while Torok and Watson ( 2000 ) observed that students at their Level A had a poor knowledge of technical terms associated with variation. They said that students at higher levels had a reasonable knowledge of technical terms, but Watson et al. ( 2003 ) clarified this by observing that students with a Level 2 understanding are familiar with technical terms (in particular, sample, random, and variation) but do not understand the meanings of these terms, whereas students with a Level 3 understanding attempt to give definitions of these terms but may resort to giving examples to explain their meanings. It is not until Level 4 that students can give sophisticated definitions of these terms without relying on examples (Watson et al., 2003 ).

Reading (2004) extended the Reading and Shaughnessy (2000, 2004) description hierarchy to apply to data variability as well as sample variability. She discovered that Levels 1 and 2 of the Reading and Shaughnessy (2004) hierarchy could be reorganized into a qualitative level and a quantitative level; the first corresponds to Watson et al.'s (2003) Level 1, and the second corresponds to Watson et al.'s Level 2. Students with a Level 1 understanding describe features of a distribution in words and not numerically, whereas students with a Level 2 understanding describe features numerically. This is consistent with the general description of Level 1 in our LP as representing a naive understanding of variability.

Difficulties or misconceptions common among students at Level 1 of our LP include the tendency to confuse the mean with the mode (thinking of the mean as the most common value; Torok & Watson, 2000), difficulty interpreting tables and graphs (Watson et al., 2003), and a lack of conceptual understanding of standard deviation (delMas & Liu, 2005).
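
To make the mean/mode distinction concrete, here is a minimal illustration (the data values are invented, not drawn from the cited studies) of how the mean, the mode, and the standard deviation of the same small data set differ:

```python
from statistics import mean, mode, stdev

# Invented data set: number of pets reported by eight students.
data = [0, 1, 1, 1, 2, 3, 4, 8]

print(mean(data))   # 2.5   -> the arithmetic average
print(mode(data))   # 1     -> the most common value, often conflated with the mean
print(stdev(data))  # ~2.56 -> sample standard deviation, a measure of spread
```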

Shaughnessy et al. (1999) and Torok and Watson (2000) both observed that, in the lollies task, some students focused on the number of red lollies (50) rather than the proportion of red lollies (one half). As a result, they tended to overpredict the number of red lollies in a sample. Torok and Watson placed these students in their Level B, corresponding to our Level 2. Ben-Zvi (2004) observed that students at his Stage 3 (out of seven stages) made informal attempts to account for variability. This corresponds to Torok and Watson's observation that students at their Level B acknowledge variation by providing a range of responses instead of specific values.
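
The sampling behavior at issue in the lollies task is easy to sketch with a short simulation. The setup below (50 red among 100 lollies, samples of 10 drawn without replacement) follows the task as described above, though the exact protocol of the cited studies may differ:

```python
import random

# Lollies-task simulation: 50 red lollies among 100; draw 10 without replacement.
population = ["red"] * 50 + ["other"] * 50
random.seed(1)

counts = {}
for _ in range(10_000):
    sample = random.sample(population, 10)          # one handful of 10 lollies
    reds = sum(lolly == "red" for lolly in sample)  # count the reds
    counts[reds] = counts.get(reds, 0) + 1

for reds in sorted(counts):
    print(f"{reds} red: {counts[reds] / 10_000:.1%}")
# Most handfuls contain 3-7 reds; extreme counts (0-1 or 9-10) are rare.
# Both "anything can happen" responses and a fixed answer of exactly 5
# miss the moderate variation the task is designed to elicit.
```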

Students at Level 3 of our LP have some facility with the application of proportional reasoning to problems of variation but may produce responses with too much or too little variation. This observation is supported by the findings of Torok and Watson (2000) and Watson et al. (2003). Watson et al. also observed that Level 3 students can calculate the mean but do not appreciate the importance of variation. This is consistent with Ben-Zvi's (2004) Stage 5, in which students can calculate measures of center and spread but do not have a conceptual understanding of these measures. It is also consistent with delMas and Liu's (2005) finding that students at this level have a cursory and fragmented understanding of standard deviation.

Watson et al. (2003) found that students at Level 2 are likely to make flawed interpretations of graphs, students at Level 3 might provide a partial analysis of a graph while missing overall trends, and students at Level 4 are likely to make statistically appropriate analyses of graphs. This last finding is consistent with Ben-Zvi's (2004) final Stage 7, in which students generate graphical displays of data and use them to compare distributions.

Reading and Shaughnessy (2000, 2004) distinguished between students who describe a distribution in terms of deviations from an anchor value that is not a central value and students who describe a distribution in terms of deviations from a central value. Reading (2004) identified the first group of students with Watson et al.'s (2003) Level 3 and the second group with Level 4, as we have done in our LP.

Finally, our Level 5 is adapted from Peters (2011); see Table 1. An understanding at this level constitutes an understanding of multiple aspects of variability and of the multiple contexts in which variability occurs. It also includes an understanding of the interconnections between variability and related concepts (Peters, 2011). It represents the level of understanding of variability that one would expect of advanced college students and in-service teachers.


Discussion and conclusions

We started this paper by highlighting the fact that competitive interactions between rival gangs often appear imbalanced. Some gangs are net exporters of violence (i.e., more often aggressors in homicides), while others are net importers (i.e., more often targets in homicides). It is reasonable to suppose that such imbalances in violence reflect imbalances in competitive ability since violence appears central to how gangs “jockey for positions of dominance” (Papachristos 2009, p. 76). Exactly how these dynamics unfold remains an open question, however, since we do not have formal expectations about how competitive dominance, gang size and directionality of violence should be related.

To rectify this situation, we turned to mathematical models first developed to deal with analogous problems observed in plant ecology (Tilman 1994). The key advantage of Tilman's model is that it allows us to make strict assumptions about competitive dominance and follow those assumptions through to their empirical expectations. The central assumption is that a superior competitor can always displace an inferior competitor wherever the two are encountered, and can always hold a site against any incursion by an inferior competitor. Under such conditions, inferior competitors can persist if they can quickly exploit space as soon as it is vacated by superior competitors, and/or if they can hold on to the sites they occupy for longer before their own activity ceases. In essence, inferior competitors are able to survive in the "interstices" between superior competitors. We mapped Tilman's model onto the case of criminal street gangs by focusing on activity patterns. Many of our general observations exactly parallel Tilman's. Our unique contribution was to extend the model to produce expectations about the relationships between competitive ability, gang size and the directionality of violence.
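
For readers unfamiliar with the model, the sketch below numerically integrates Tilman's (1994) dynamics under the strict competitive hierarchy just described. The notation follows the model (c for activity spread rate, m for activity cessation rate); the parameter values are illustrative only and are not estimates from gang data:

```python
# Tilman's (1994) hierarchical competition model, gangs indexed from the top
# of the hierarchy down:
#   dp_i/dt = c_i p_i (1 - sum_{j<=i} p_j) - m_i p_i - sum_{j<i} c_j p_j p_i
# p_i is the proportion of sites where gang i is active.

def step(p, c, m, dt=0.01):
    dp = []
    for i in range(len(p)):
        open_space = 1 - sum(p[: i + 1])    # sites free of gang i and its superiors
        growth = c[i] * p[i] * open_space   # spread into available sites
        loss = m[i] * p[i]                  # activity ceases at rate m_i
        loss += sum(c[j] * p[j] for j in range(i)) * p[i]  # displacement by superiors
        dp.append(growth - loss)
    return [max(0.0, p[i] + dt * dp[i]) for i in range(len(p))]

p = [0.01, 0.01, 0.01]   # small initial footholds
c = [0.2, 0.8, 3.2]      # inferior gangs spread faster...
m = [0.1, 0.1, 0.1]      # ...with identical cessation rates (a pure strategy)
for _ in range(100_000):
    p = step(p, c, m)
print([round(x, 3) for x in p])  # near-equilibrium: about [0.5, 0.25, 0.125]
```

With these pure-strategy parameters all three gangs coexist, and the proportion of space each uses declines with competitive rank.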

The model suggests that gang size, when measured as the proportion of space used by a gang, is not a simple proxy for a gang's competitive rank (see especially Figs. 2, 3). Gang size and competitive rank are positively correlated only if all gangs in a competitive hierarchy adopt a pure strategy for coexistence. That is, all of the gangs must either have identical activity cessation rates and leverage variable activity spread rates, or have identical activity spread rates and leverage variable activity cessation rates. If individual gangs adopt mixed strategies, then gang size fails to track competitive rank. The largest gangs can be competitively inferior and the smallest competitively superior in terms of absolute displacement ability. The model also suggests that the directionality of violence, as measured by the homicide in- and out-degree per gang, is not a simple proxy for competitive rank (see especially Fig. 5). Large gangs typically experience more overall violence (cumulative in- and out-degree) than small gangs. However, variation in competitive rank (and random noise in activity cessation and spread rates) can cause a gang to flip from being a net importer to a net exporter of violence.
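
These claims can be checked directly because the model's equilibrium has a closed form, each gang's share of space being p*_i = 1 - m_i/c_i - sum over j < i of p*_j (1 + c_j/c_i). A minimal sketch (parameter values again illustrative) contrasts a pure strategy, where size tracks rank, with a mixed strategy, where it does not:

```python
def equilibrium(c, m):
    """Closed-form equilibrium of Tilman's hierarchical model:
    p*_i = 1 - m_i/c_i - sum_{j<i} p*_j * (1 + c_j/c_i)."""
    p = []
    for i in range(len(c)):
        p_i = 1 - m[i] / c[i] - sum(p[j] * (1 + c[j] / c[i]) for j in range(i))
        p.append(max(0.0, p_i))  # a gang that cannot invade holds no space
    return p

# Pure strategy (identical m, variable c): space used declines with rank.
print(equilibrium(c=[0.2, 0.8, 3.2], m=[0.1, 0.1, 0.1]))  # [0.5, 0.25, 0.125]

# Mixed strategies (both c and m vary): the second-ranked gang is now largest
# and the third-ranked gang smallest, so size no longer tracks rank.
print(equilibrium(c=[0.3, 1.2, 6.0], m=[0.2, 0.1, 0.1]))  # ~[0.33, 0.50, 0.03]
```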

We examined the implications of the models using homicide data from LAPD's Hollenbeck Community Policing Area. Territory size is not strongly correlated with the directionality of violence between rivals, as measured by in- and out-degree over the homicide network. Territory size is only marginally better at predicting the total volume of violence. The model presented here suggests that we should not be surprised by this result, as competitive ability, gang size and directionality of violence need not be strongly connected, even where absolute competitive dominance exists. The observed in- and out-degrees for the Hollenbeck homicide network are perhaps more consistent with gangs leveraging faster activity spread rates to circumvent competitive asymmetries than with an alternative model of slower activity cessation rates. However, we have not performed rigorous model evaluation, as there remain many unknowns that deserve further theoretical discussion (see below). Nevertheless, it is reasonable to hypothesize that gangs such as El Sereno, and perhaps Clover, are net importers of violence as a result of large size and relatively high rank in competitive ability. By contrast, gangs such as KAM and Lincoln Heights may be net exporters of violence because of an intermediate size and relatively low competitive rank. However, there are gangs that do not neatly align with model expectations. These outliers either have observed in-degrees that are much larger than expected for their small territory size (e.g., Primera Flats, Tiny Boys), or much smaller than expected for their large territory size (e.g., Metro 13). Assuming that the in- and out-degree counts are accurate, alignment with model expectations would require that territory sizes be adjusted upwards or downwards.
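
The degree measures used in this comparison are straightforward to compute from an aggressor-to-target edge list. The sketch below uses invented gangs and homicide counts, not the Hollenbeck data:

```python
from collections import Counter

# Each edge points from the aggressor gang to the target gang (invented data).
edges = [("A", "B"), ("A", "C"), ("B", "A"), ("B", "C"), ("C", "A"), ("C", "A")]
territory = {"A": 2.5, "B": 0.8, "C": 1.1}  # territory sizes in km^2 (invented)

out_degree = Counter(aggressor for aggressor, _ in edges)
in_degree = Counter(target for _, target in edges)

for gang in sorted(territory):
    net = in_degree[gang] - out_degree[gang]  # positive -> net importer of violence
    print(gang, territory[gang], in_degree[gang], out_degree[gang], net)
```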

Limitations

This study has several important limitations. First, homicide data may not be the best metric with which to assess gang dominance, given that these acts of violence are likely rare compared with less severe options that may accomplish much the same thing (e.g., aggravated or simple assault). However, since most acts of gang-related violence involve firearms (Huebner et al. 2016; Maxson et al. 1985; Maxson and Klein 1990; Pizarro 2017; Rosenfeld et al. 1999; Valasik 2014), the difference between a gang-related homicide and a gang-related aggravated assault may be essentially random. Thus, more dominant gangs may attempt to use less severe acts of violence, yet the result may still be a homicide. Furthermore, research has shown that the investigation of homicides by law enforcement is likely to be the most robust, given that there is almost always a victim and a specialized police unit that dedicates substantially more investigative time and effort to their resolution (Petersen 2017; Pizarro et al. 2018; Regoeczi 2018). In this study, the thoroughness with which gang-related homicides are investigated is expected to provide a much more complete picture of the violent event, including reliable data on the gang affiliations of both the target and the aggressor, two crucial pieces of information needed for the current analyses. As such, the use of gang-related homicides as the sole metric of violence is likely to be a conservative measure.

It is premature to conclude that territory size is not at all a useful predictor of competitive rank. Part of the problem may be with the way that gang territories are recognized and measured in real-world settings. Recording gang territories as bounded, convex polygons may be pragmatic. However, there is good reason to question whether this is a realistic representation of the distribution of gang activity, gang areal control or gang competitive position. It has long been recognized that gangs may claim a large swath of land, but that most hanging out occurs at only a handful of locations, termed ‘set spaces’ by Tita et al. (2005). In fact, Valasik (2018) finds that areas with high concentrations of gang member residences and gang set space locations are most at risk of experiencing a gang-related homicide. It might be more appropriate to think of gang territories as a network of place-based activity nodes and the corridors or pathways between them. This would be a group-level analog of crime pattern theory (Brantingham and Brantingham 1993). Some nodes and corridors might be common to the gang as a whole (i.e., set spaces), while others might be tied to the activities of single gang members (e.g., gang member residences). Gang territories seem to overlap quite substantially when drawn as convex polygons. For example, in the entire city of Los Angeles approximately 40% of all documented gang turfs overlap according to 2010 gang territory maps. However, if territories are really a “mesh” of shifting nodes and corridors between them, then the actual equilibrium size distribution of gangs may be quite different from (and lower than) that measured using territory maps.
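
Measuring such overlap from turf maps is itself straightforward. The sketch below (hypothetical coordinates, using the shapely package) computes the contested fraction of the combined area of two rectangular turfs drawn as polygons:

```python
from shapely.geometry import Polygon  # third-party package: pip install shapely

# Two hypothetical rectangular turfs that partially overlap.
turf_a = Polygon([(0, 0), (4, 0), (4, 3), (0, 3)])
turf_b = Polygon([(3, 1), (7, 1), (7, 4), (3, 4)])

contested = turf_a.intersection(turf_b).area      # area claimed by both gangs
combined = turf_a.area + turf_b.area - contested  # union of the two turfs
print(f"{contested / combined:.1%} of combined turf area is contested")  # ~9.1%
```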

This concern over defining territories raises a related issue about modeling both spatial and temporal patterns of gang behavior. The models presented above are spatially implicit. They deal only with the proportion of space occupied by a gang, not the actual spatial arrangement of those gangs. The models do imply, however, that the spatial arrangements of gangs are subject to constant change. Even though gangs occupy a stable proportion of the landscape at equilibrium, there is regular turnover in which gangs occupy which sites. Such change is not consistent with the “turf-as-polygon” view of gang territoriality. It may be more consistent with the idea that gang territories are a shifting mesh of nodes and corridors. Spatially implicit models also do not take into consideration any constraints of mobility (Hubbell 2005; Turchin 1998). How far people move plays an important role in the generation of crime patterns (Brantingham and Tita 2008) and presumably plays an important role in the formation and maintenance of gang territories (Brantingham et al. 2012; Hegemann et al. 2011; Valasik and Tita 2018). Including mobility in the current model would require a spatially explicit approach. Such models are much more challenging mathematically, but frequently lead to novel insights quite different from those of spatially implicit models (Kareiva and Wennergren 1995; Tilman et al. 1994). Thus, it is premature to claim that faster activity spread rates will be a decisive property in a spatially explicit system of gangs.

The models developed here offer only a limited view of competitive dynamics. We recognize that it is extreme to assume that gangs form a strict competitive hierarchy. This assumption is theoretically valuable as a form of counterfactual. It is much more likely, however, that competitive ability is context dependent (Hubbell 2005). Who has the upper hand in any one dyadic interaction may depend as much on where an interaction takes place, or who is present, as on some global competitive ability of the gang. A more detailed assessment of the costs and benefits that arise in competitive interactions across contexts is needed. For example, it is perhaps unrealistic to assume that inferior gangs will continue to attack superior gangs if such attacks never yield successful displacements. The contexts in which attacks are successful and unsuccessful may carry great importance for understanding competitive dynamics.

A related concern is whether it is reasonable to model a community of gangs as a single competitive hierarchy. Competitive interactions may be restricted to smaller clusters of gangs that exist in close spatial proximity to one another. A broader community of gangs may in fact be best modeled as a multiscale system composed of several competitive hierarchies that sometimes interact. These concerns again point us in the direction of spatially explicit models where the competitive ranking of gangs may shift across the landscape. It also suggests a role for game theory in modeling competition as strategic interactions that might include behavior other than acting as a superior (or inferior) competitor. Specifically, we believe it will be important to relax the assumption that activity spread and cessation rates for each gang are unchanging in time. These traits, if important, presumably would be under heavy selection via some learning mechanism. Inferior gangs might be put at an even greater disadvantage if superior gangs seek to close off spatial opportunities in response to competitive interactions by evolving their activity spread and cessation rates. These possibilities will require further examination.


