Open Access

Undecontextualizable: Performativity and the Conditions of Possibility of Linguistic Symbolism


In this article I argue that the canceling out (or defeasing) of performative indexical functions is a condition of possibility on linguistic symbolism. I show this to be the case by looking at words and expressions—like curse words and name taboos—whose performative functions can be canceled out only with the greatest of metalinguistic labor. I show that these indefeasible or rigid performatives are the semiotic-functional converses of J. L. Austin’s explicit performatives (e.g., “promise,” “bequeath”) in terms of (i) the orders of regimentation between semantic-symbolic and pragmatic-indexical functions, (ii) the indexical anchoring of pragmatic effects within either denotationally mediated events-of-narration (En) or interactionally mediated events-of-signaling (Es), and (iii) the articulation of indexical function with speech participant roles. The article concludes with a reflection on how the architecture of the phonology-semantics interface (or duality of patterning) safeguards symbolism by impeding the processes of runaway semiotic naturalization that produce rigid performativity.

If J. L. Austin (1962) had begun with blasphemy rather than baptisms and bequeathals, the theory of performativity might have proceeded in rather different directions. Of course, he didn’t begin with swear words (though he does mention them in passing); he began with examples like: “I hereby dub this ship the ‘Queen Elizabeth II,’” said by a duly authorized person while breaking a bottle of champagne across the hull of the very boat thereby named. That is, he began with explicit performatives, those sentences that seemingly accomplish just what they describe.

In this essay I will suggest that explicit performatives and verbal taboos (a natural linguistic kind I will characterize as “rigid performatives”) are complementary phenomena, two poles of a continuum of language-mediated performativity. It is my hope that by clarifying the parameters that render the performative a heterogeneous functional space we will be able to anchor a comparative framework for the study of social indexicality. Nevertheless, I am aware that many semiotically inclined anthropologists are uninterested in comparative approaches. For them, it is my hope that this essay will serve as a reflection on the particular conditions of possibility of linguistic symbolism, nestled (and sometimes swallowed up) within boundless thickets and thorn-bushes of iconic indexicality. Though I am interested in verbal taboo in comparative perspective, in Section 1 I largely restrict myself to English (and French) curse words as the empirical point of departure for my reflection. I do this because I wish to employ a set of materials that are familiar to most readers in the hopes that this will render the theoretical distinctions that I make more palpable.1

1.  Explicit and Rigid Performativity

A central aim of this article is to demonstrate that explicit performativity (henceforth EP) and verbal taboo represent two extremes in the articulation of pragmatic function (or the speech act function) with metapragmatic discourse and function. This argument relies upon Michael Silverstein’s theorization of metapragmatics (Silverstein 1976, 1987b, 1993). I therefore begin with a recontextualization of Austinian explicit performativity in terms of metapragmatics.

Explicit Performatives as Metapragmatic Verbs

Silverstein observes that it appears to be a universal that in all language communities some set of words and expressions used to symbolically refer to and predicate about speech acts (i.e., metapragmatic verbs and nouns) can also be used to indexically accomplish speech acts. English verbs like bequeath or command and noun phrases like warmest regards and salutations can be used to refer to speech acts as well as to accomplish them. We can see this here in examples 1 and 2:2

1.  Jesse commanded[En ≠ Es] Paul to leave.

2.  I command[En = Es] you to leave.

In How to Do Things with Words, Austin notes this double aspect of performative verbs. However, before he adopts the locution-illocution-perlocution distinction in lecture 8, he assumes that “constative” and “performative” uses form disjoint sets. He argues that though the EP “serves to inform” interlocutors of the character of the act, it doesn’t describe it (Austin 1962, 6). Silverstein (1979) dissents from this characterization. The grammatical particularities of “explicit” constructions (i.e., 1st person subject, 2nd person [indirect] object, present tense), characterized by the “reflexive calibration” of denotational indexicals onto the event of signaling, are every bit as motivated to serve as an act of referring and predicating as any other referential mapping of semantic roles onto persons present or absent or of verbal inflections of tense and aspect with respect to the event of speaking. It is true that the metapragmatic verb functions (where felicitous) as a command (promise, bequeathal, etc.) and not as a modalized description of such activity. But that doesn’t mean that the explicit form lacks propositional content supplementary to that indexical accomplishment (cf. Agha 2007, 41, on modalization of propositional content).

Because the signal form employed in giving a metapragmatic description and the signal that accomplishes the pragmatic act are one and the same, it may be difficult to grasp this universal of the metapragmatic lexicon; to repeat: in all languages some entries of the metapragmatic lexicon can both be employed (in some set-1 of contexts) in metapragmatic discourse that characterizes some conventional pragmatic act without accomplishing the act described, and (in some other set-2 of contexts) to accomplish that same act. It may be heuristic to consider this double functionality by means of a nonlinguistic example: It is difficult to imagine that the act of shaking someone’s hand could, in one set of situations, count as an emblem of solidarity between two individuals but, in another set of situations, count as a description of that solidary act.3 It is hard to imagine that honking the car horn could sometimes be to warn, rebuke, alert, and so on, other motorists, but in some other “naturally occurring” set of cases be used to refer to acts of warning, rebuking, alerting, and so on. The EP describe/do duality is a particularly linguistic duplicity, while performativity—as car horns and handshakes attest—is a phenomenon by no means circumscribed by the linguistic.

The peculiar and unique property of language that serves as the affordance for EPs is the reflexive capacity of language to re-present that discursive event in this one. This is accomplished by creating a linguistic icon of the earlier event and surrounding that icon with framing material (e.g., a modalized verb of saying) that referentially indexes the nonequivalence of the narrated event (En) and the here-and-now event of signaling (Es) (e.g., “He said[En < Es], ‘I promise you …’”). What Émile Benveniste (1966) called delocutive words illustrate how, where canalized, this re-presentation of performative signals while signaling the nonequivalence of the performative event and the here-and-now event of signaling may be generative of new metapragmatic lexemes. Benveniste noted that there is a series of verbs denoting speech acts that appear to be derived from what Austin called “primary” or “primitive” performative locutions. In such delocutive derivation there is a maximal tightness of linkage between reporting and reported event. Here, instead of a verb like say (or dire) framing the speech act as quoted material, the quoted material itself serves as the verb of speaking. To illustrate this process, we use the example of the (primitive performative) phonation—“shhhhh”—employed to silence people. Compare:

3.  Shhhhh!

4.  The librarian said “Shhhhh!” to me.

5.  The librarian shushed me.

The delocutive verb to shush is derived by “rank-shifting” (Silverstein 1979, 240) the sonic substance that functions in a primitively performative manner to silence interlocutors in Es into a metapragmatic descriptor characterizing some En. To accomplish this, an epenthetic vowel is inserted (sh-ə-sh) rendering the sequence in accordance with the phonotactic rules of English—that is, transforming it into an acceptable lexical form (or symbolic type) in English. Once rank-shifted, the signal (shush) can be used to report acts of censure not (represented as) accomplished by means of the shhh sound sequence. (“They shushed me when I tried to bring up the topic of trash removal at the municipal council meeting” needn’t describe an event where shhh is purported to have been employed.) This shows that shush describes a kind of social action—silencing others in some respect and accomplishable by diverse means—not necessarily (even if prototypically) linked to the production of the phonetic sequence [ʃʃ:]. There is a many-to-one relationship between pragmatic object-signs and the metapragmatic description or classification of those acts.4

Benveniste’s delocutive verbs offer a diachronic account for why pragmatic form and metapragmatic form are often icons of one another. For verbs like tutoyer and vouvoyer, this is a one-way street. The pragmatic forms are rank-shifted into descriptors that cannot again be employed to accomplish the act they describe. (“Je te tutoie” is performative because of the pronominal clitic, not because of the verb stem.) For others, however, this is a “productive” dualism, even synchronically. Note that shush, the lexicalized form of [ʃʃ:] can be used to, well, shush people. This is still more primitive than a would-be explicit (*“I hereby shush you.”). Still, in other cases the delocutive form may be integrated into the explicit frame (e.g., “I salute you”; cf. “Salutations!”).

If the delocutive pathway described by Benveniste involves the movement from pragmatic function₂ to metapragmatic function₁, in the case of explicit performatives (EPs) we can observe that this iconic identity between signal that accomplishes and signal that reports the accomplishment is not merely the artifact of a historical sequence but a synchronically productive dualism.5 There are three dimensions of this duality that Silverstein emphasizes in discussing this particular and special case of pragmatic-metapragmatic relations:

i.  The pragmatic signal is an iconic indexical where, in its iconic dimension, it is an icon of the metapragmatic symbol:

How does [the explicit performative] work? It seems clear that the performative verb itself, as one of a set of metapragmatic descriptors that designates a type of speech event as an enactable, accomplishable relationship between a speaker and an addressee, specifically classifies the type of contextual creativity, or entailed consequences, that the instantiated event, as so predicated/designated, is understood to effect. In short, it specifies the conventional functional₁ type of which this specific speech event—anchored to a specific speaker and addressee—is understood to instantiate as a token. Predicating the type with a token of the type … makes the instance a special “iconic indexical,” or replica-in-actuality.

(Silverstein 1987b, 34)

In the uttering of a felicitous EP (e.g., “I hereby promise you”) the indexical act (a promise) is an icon of the metapragmatic symbol (to promise) whose tokens mediate the accomplishment of the act.6

ii.  The iconic identity between metapragmatic symbol and performative index is maintained (supported, sustained, underwritten, etc.) by the referential biasing of folk metalinguistic consciousness (see in particular Silverstein 1979, 1981). That is, as symbol users, humans have a tendency to identify the act with their characterization of it (or, in “referentialist ideologies,” to equate the characterizing characteristics of language with language itself). This is a connection that is diachronically reflected in nonexplicitly performative linguistic types as well. So, for instance, honorifics are most often instantiated through distinct formal means of referring to the targets of honorification (e.g., First Name versus Title + Last Name), though this is by no means the only possible or attested way to enact deference through language (Agha 2007, 315–22). Referentialist biasing tends to conflate or aggregate sociopragmatic and semantico-referential functions, and thus acts as a strong bias in the diachronic development of metapragmatic ideologies and pragmatic repertoires. EPs are an exemplary case in point.

iii.  In the case of EPs, pragmatic functioning is relatively “transparent” to metapragmatic ideology (Silverstein 1979, 210). That is to say, ideology approximates the linguistic facts that it seeks to rationalize. Metapragmatic transparency is a term used elsewhere by Silverstein (1987a, 160–62) in describing the character of “true” (i.e., 1st and 2nd person) pronouns (Benveniste 1966). Person deictics, as form types, specify metapragmatic rules of use that determine the reference of their tokens. The 1st and 2nd person pronouns have inherent metapragmatic content; 1st person pronouns specify that their tokens should be interpreted as referring to the (represented) Speaker of that token; 2nd person forms specify that the (represented) Addressee of the token is included in its denotation; 1st person exclusive forms that the (represented) Addressee is not. The relationship of code-level metapragmatics to discourse-level pragmatics is “transparent” for the case of pronouns in the sense that all that a speech recipient must know is the inherent metapragmatic content of the type (= “pragmatic rule of use”) in order to identify the referent of its token.7

There is thus an intriguing affinity between person deictics and metapragmatic verbs. Indeed, in his touchstone 1976 paper (“Shifters, Linguistic Categories, and Cultural Description”), Silverstein characterizes explicit performatives as “metapragmatic shifters.” It is instructive to think through the similarities, but also the differences, between pronouns and explicit performative verbs. The similarity between shifters and EPs lies in the fact that in both cases, linguistic types provide an exhaustive metapragmatic characterization of their token-level pragmatics (i.e., they have this as their inherent “metapragmatic content” [Silverstein 1987a] that provides “pragmatic rules of use” [Silverstein 1976] for interpreting their tokens). The difference is that pronouns offer that characterization for the act of reference. Explicit performatives offer it for the particular act specified in the type-level semantics of the verb (e.g., promising, bequeathing, betting, etc.). Pronouns specify the referent of a token; performatives specify the act. The cross-functional dimension—from semantic to purely pragmatic—is unique to explicit performatives.

To review: EPs are unique in that they characterize and accomplish the same social acts. But (and this is a point I have yet to demonstrate) they can only do this because of a complementarity between events of accomplishing and events of characterizing. The events of characterizing cite—quite literally in the cases of delocutive verbs—the acts that they characterize. Performative uses, meanwhile, cite their metapragmatic descriptors; the pragmatic act cites the semantic meaning (Nakassis 2013). If what I say is accurate, there is something of a necessary tradeoff between the specificity (“explicitness”) of illocutionary acts and their “force,” where we understand the force of some pragmatic function to be equivalent to its resistance to decontextualization. EPs accomplish highly specific acts, but their performativity evaporates upon decontextualization, as when the act is reported:

6.  She said: “I promise you” to him.

or simply

7.  She promised him.

In these reportive collocations, the performative force of the linguistic signal /pɹɑmis/ is defeased. And of course this makes sense given the analysis we have just given of the iconically mediated cross-functional transformation of symbolic referring and predicating into indexical doing. The trompe-l’œil of the explicit performative is only effective where Speaker refers and predicates Speaker as subject and Addressee as (indirect) object, and where no deictic operator with more global scope intervenes to make Es ≠ En (e.g., a verb of speaking, or nonlinguistic contextualization cues like a playbill at the theater). Explicit performatives are effete—the moment that Es ≠ En, their force evaporates. But this indexical frailty is their symbolic strength. It is precisely because the symbolic function can stand apart from its corresponding nonreferential indexical function—that is, is prescinded (to use Peirce’s preferred term) in naturally occurring discourse—that it can be used to refer and predicate about social acts in a flexible and open-ended manner. Its symbolic function is not tied down by its indexical function. As I will argue, prescinding in practice is critical for the reality of symbolism to be realized in communication.

*I Hereby Insult You!

We are, at this point, prepared to set up the structural contrast between explicit performatives (EPs) and verbal taboos that I promised at the outset. We might begin with a little bit of gardening, weeding around what linguistic species we wish to discuss. In this section I will be concerned to give a semiotic functional characterization of verbal taboos of a particular sort. In particular, I will be concerned with words and expressions that correspond to a particular semiotic-functional description. Thus our units of analysis are roughly comparable to the EPs—lexical units with a phonological form. I will not be discussing what are sometimes in political discourse referred to as taboo topics (e.g., homosexual conversion therapy, third trimester partial-birth abortions, etc.). As with our discussion of EPs, we are interested in the intersection between language-structured symbols and performative indexicality.

We should also right away discard a kind of preconception that sometimes affects thinking about verbal taboos whereby taboo expressions are seen as reducible to the affective social categorization of their real-world denotata. The sociocultural construction of euphemism and dysphemism is, certainly, an important empirical dimension of verbal taboos (Leach 1964; Allan and Burridge 1991). But we should remember that though the semantic sense and reference of taboo expressions may be a first step in the diachronic development of their pragmatics, it is far from being the last. Indeed, what will be of interest to us here is how differently the relationship between semantic “explicitness” and pragmatic “force” develops in this case, and this difference relies precisely on the tenuous relationship of verbal taboos to the dimension of semantic signification.

The problem with a narrow euphemism treatment of taboo language can be illustrated with a witticism from S3E5 of The Last Man on Earth, where instead of saying that so-and-so was “pissed off,” Carol euphemistically says “urinated off.” The joke consists in forcing a recognition of how far the substitute (or target) expression misses the mark in terms of achieving the pragmatic force of the avoided (or source) expression. The way in which it misses that mark is by literalizing the etymological (or first-order) semantics of piss ‘to urinate’. A semantic solution is offered for a pragmatic problem. Of course, the expression to (be) piss(ed) off has nothing to do with urine or urinating. The verb to piss off might be glossed as ‘to cause someone to be very upset’. Crucially, its etymological connection to piss, of the liquid variety, is built on a pragmatic, rather than a semantic, analogy. With pissed off we encounter a kind of second-order semantics that seems often to emerge in the taboo lexicon and its surround, suggesting a motivated diachronic pathway (see fig. 1). The relationship between first- and second-order semantics here can be contrasted to the relationship between primary performatives and delocutive metapragmatic lexemes that we discussed above.

Figure 1. Second-order semantics of English and French curse words. The Quebecois French data draw on Vincent (1982).

In his discussion of EPs, Austin notes that not all descriptions of speech acts can themselves function as speech acts. He chooses as an example the infelicity of “I insult you,” which, uncharming though it may be, rather fails to pack the desired verbal punch (Austin 1962, 31; cf. Silverstein 1987b; Agha 2007, 61). He uses this observation to make the important point that a convention must exist—that the descriptive backing of the (as we would call it) metapragmatic lexeme is not enough to ensure performative “felicity” of its Es-co(n)textualized tokens. He is certainly correct that a convention does not exist and that “I insult you” could theoretically be insulting should such a convention exist. However, in reflecting on the empirical absence of such a convention in English, as opposed to its theoretical possibility, it may be more productive to think about the question from the bottom up rather than from the top down (where “up” stands for “higher”-order metapragmatics and “down” for the “lower”-order pragmatics they regiment). Perhaps it is not so much that the metapragmatic term cannot be employed in performative enactment as that the performative act cannot easily be cast as a delocutive citation, which impedes the development of the kind of metapragmatic-pragmatic duality characteristic of EPs for the case of insulting. Certainly there does seem to be a whole distinct set of, one might say, register concerns that enter into the fray when it comes to describing actions with verbal taboos as opposed to with canonical EPs like promise and bequeath. Take the following well-formed metapragmatic characterizations of interactions:

8.  He was really pissed off at me.

9.  I totally fucked up the exam.

10.  They shat on my idea.

Note that though these are metapragmatic descriptions, they are not delocutives in Benveniste’s sense. The events that these taboo-employing expressions characterize are not performative acts accomplishable by the terms in question. (It is true that our use of “metapragmatic” is no longer linked specifically to the metapragmatic characterization of some performative “illocutionary” act. These are nevertheless still metapragmatic descriptions. When one says, “He was really pissed off,” one is giving a metapragmatic characterization of some set of interactional signs framed as indexing the affective state of being upset.) Functionally interwoven with their second-order semantics, utterances like 8, 9, and 10 also have “expressive” or “intensifying” functions (see, for instance, Potts’s [2005] work on referent-focal expressives like jerk). But how does the intensification work here? Some pragmatic function of the taboo expression is channeled in such a way as to “intensify” the metapragmatic characterization of some En. The connotation—if connotation were to denote a performative effect—of piss is channeled toward the referent (“He was [soooo] pissed”), characterizing by pragmatic signification just how upset the individual was. With Derrida (1971), we might say that this is not so much a polysemy of piss as a dissemination of its pragmatic force across elements of the code.

That this channeling of pragmatic force draws upon (Animator-focal) register distinctions means that it is an expressive relationship whose appropriateness does not depend only upon a certain stance that Speaker (Es) enacts with respect to the Referent (En) but also upon the speech Animator’s relationship to speech Recipients in the here-and-now event of signaling (Es). Terms that are less often normatively “appropriate to context” (cf. fuck > screw > mess in fig. 2) express stronger or more intensified affective stances with respect to the narrated event.

Figure 2. Cursing “speech levels” keyed to semantic functions (above) and to purely expressive functions (below).

In expressions like fuck up/around/with, a pragmatic “force” centered in the interactional here and now is brought into juxtaposition with modalized reference to other events. This perduring “force” of fuck in the here-and-now event of signaling can be expressed in a technical language; it is a function of the indefeasible performativity of tokens of this lexical type—that is, fuck is a rigid performative (Fleming 2011). Here we import an anthropological understanding of taboo as a categorical rule of iconic-indexical causation into the domain of language. Taboos are categorical proscriptions that in this sense break with context in resisting contingent co(n)textualizations of the local function. Globally, recontextualization is always possible, as in ritual inversions of taboos where the normative estimation of the taboo act is reversed even if its (local) force is not defeased. Nevertheless, the rigid performative (henceforth, RP) has a perlocutionary “violence” by virtue of the way in which the occurrence of a taboo token ruptures or bifurcates context into a qualitatively distinct frame before and after the occurrence. This is what makes taboos so productive of moral, political, ethical frames (cf. Butler 1997 on the “sovereign performative”).8 Mike Pence’s rule that he not have a meal alone with a woman other than his wife is a case in point (Blake 2017); the proscription has the perlocutionary effect of sexualizing contexts of isolated cross-gender co-presence, forcing one particular contextualization of gender difference over an infinitude of other possible contingent contextualizations of co-presence.

To return to the second-order semantics of curse words, a purely nonreferential, and indefeasible, indexicality at work in the “interactional text” is placed side by side with a particular swatch of “denotational text” (Silverstein 1993). Diachronically this juxtaposition of pragmatic force and denotation generates the second-order semantics of curse words. Evidence for this can be found in the fact that the symbolic content of these second-order forms emerges in the characterization of negatively evaluated denotata (whether they be disreputable, stigmatized, immoral, unpleasant, etc.). Here there appears to be a diachronic transference between pragmatic force and pejorative semantics. Take the adjective shitty. One might be tempted to interpret it as merely an expressive. But it should be observed that it contributes a sense (roughly equivalent to bad) that changes the truth-conditional semantics (“They have a TV” doesn’t have the same propositional content as “They have a shitty TV”). This is not, of course, to say that curse words never have purely nonreferential effects—of course they do, as in infixation of fucking (compare “Unbelievable” and “Unfuckingbelievable”). Rather, the conjoining of that expressive function in the here-and-now Es with a denotational co-text concerning some En can drive the development of “dysphemic” semantics (Allan and Burridge 1991). The channeling of the expressive function toward (rather than its placement in) En necessarily involves an iconic indexical projection from Es into En, which is the functional converse of the explicit performative. Here the exhibited effect is interpreted as an iconic indexical of that co(n)textually foregrounded or juxtaposed figure (e.g., the individual referred to by fucker is iconic with the normative evaluation of the act of saying “fuck”).

Metapragmatic Blocking

At this point we are in a position to understand why it is Austin’s “A”-class infelicities (the existence of “an accepted conventional procedure …” [1962, 14]) that impede “*I hereby insult you” from actually being insulting: the terms and expressions that are conventionally employed to insult cannot be bracketed within a narrated event in the way that EPs can. They cannot be cited, mentioned, iterated, exhibited, or en-token-ed without triggering an efferent performative effect in Es. This is another way of saying that they cannot function as purely symbolic metapragmatic descriptors—their delocutive path is blocked by the perdurance of their performativity. I will call this metapragmatic blocking.

As the second-order semantics of curse words (e.g., to piss off, to fuck up, shitty, etc.) illustrate, the indexical functions₂ of verbal taboos are incorporated into their functional₁ symbolic characterizations of narrated events. Unlike the isomorphic lamination of functions “1” and “2” in EPs, however, this incorporation is always partial and crucially dependent upon a logic of what we might call interactional sacrifice—their metapragmatic function is accomplished only by means of exhibiting the performative effects of the pragmatic types whose bodies these second-order forms haunt.9 It is this iconic-indexical juxtaposition that accounts for the interesting correlation that we have noted between stereotypes of register use (the normative negative estimation of cursing as an emblem of speech Animator character) and the normative negative estimation of the actions, events, and persons described in employing the second-order semantics of curse words. Here, just as with the explicit metapragmatic lexicon, we have an ideologically guided historical process. In that case, the pragmatic and metapragmatic functions superimposed upon the same signal (e.g., promise) were iconic with each other in terms of the particular conventional illocutionary act they described/accomplished. In the case of curse words such a rhematic alignment is impossible. Here the derived metapragmatic function₁ is enabled by exhibiting the pragmatic function₂ of the signal. What the sign describes and what it accomplishes are not aligned in terms of an iconic sketch of the conventional “illocutionary” act—the act of being fucked up (i.e., inebriated) is not an icon of the act of using the word fuck (i.e., cursing). Rather, they are aligned in terms of an iconic sketch of the “perlocutionary” effect of the act. Cursing and being drunk are socially disapproved of; indeed, that normative evaluation is part of the perlocutionary effect of these acts. A similar logic governs the pragmatically driven polysemy of other curse words; saying shit is, as it were, shitty, and so on.

Primitive Performativity

In the discussion so far we have used the duality of explicit performativity and metapragmatic description as a model for thinking about verbal taboos. We have concentrated, in particular, on the difficulties of employing verbal taboos in metapragmatic function. Now we turn to uses of curse words as employed in their most interactionally ritualized and formulaic mode—that is, in the acts of cursing that most closely approximate explicit performative speech acts. Formulaic, conventionalized acts of cursing at someone, at least in American English, are more akin to Austin’s primitive performatives than to his EPs. Take, for example, the highly conventionalized form for insulting addressee: “Fuck you!” Now if the second-order semantics of, for example, fuck around and fuck up most closely approximate EPs in (exclusively) metapragmatic reportive function, “Fuck you!” most closely parallels the EPs in performative function. Here, at least, the addressee is explicitated. It is true that the speaker makes no grammatical appearance. Indeed, were the speaker to occur in the expected EP-subject position it would be rather infelicitous (“*I [hereby?] fuck you”), perhaps read as awkwardly bringing the expression back to some first-order semantic (i.e., sexual intercourse) reading. “Fuck you!” is certainly more of a primitive performative than an explicit one. That which is done—to insult—is not that which is described (if our argument concerning metapragmatic blocking is correct, in some real sense it can’t be). The problem is more profound than this in fact, since the lexeme fuck employed here is senseless. The echo of the explicit consists in the 2nd person singular pronoun that “explicitates” the target of the verbal attack. But fuck has no semantic entailments in this utterance, something readily seen by its ungrammaticality. (The alternative, “Go fuck yourself,” may be an attempt to bring “Fuck you” back into the domain of the sensible by imposing on it an imperative reading that would require a reflexive pronoun.)10 In other conventional expressions like “Fuck off!,” there is only the slightest trace of the constative function—in some sense the Speaker is ordering the Addressee to go away (cf. metapragmatic “He fucked off” as roughly synonymous with “He went away”). Fuck off (cf. register “up”-grade, (to) piss off) evidences the complex balancing act between semanticity and pragmaticity in this domain. Here the subjectless infinitive is clearly analyzed on analogy to the imperative. Saying “Fuck off!” is an illocutionary act roughly analogous to saying “Go away!” These Es-centered conventional cursing formulae can themselves give rise to what I have called, in figure 2, delocutifs manqués (e.g., he fucked/pissed off or décalisser/décrisser for Quebecois French). The interested reader should compare the dissemination of indexical effects here with the polysemy of krama andhap ‘predicates of dispossession’ as described in Errington (1988).

More purely expressive is the use of the bare taboo form in what Goffman 1978 called “response cries”—the Shit! of a spilled coffee, the Fuuuuuck! of a jaw-dropping revelation (cf. Seizer 2011). Here taboo expressions are wholly senseless—there is no recoverable propositional content, even if there are quite highly socially structured indexical entailments of usage linked to context (Kockelman 2003). Referent-focal uses of fucking infixing (as in “Los fucking Angeles”) or fucking as an adjective (as in “a fucking problem”) are similarly senseless. Notably, it is when, and only when, fuck is senseless (or purely “expressive”) that phonologically derived substitutes can serve as paradigmatic alternants (see bottom of fig. 2). For me at least, “He was {fucking / frigging / flipping} stabbed” are all utterances that fit within some register classification, whereas only fucking is felicitous in “They were {fucking / *frigging / *flipping} in the bedroom.” Diane Vincent (1982) reported the same finding in her study of blasphemy in Quebecois French.11

There is a profound connection here between the functional instantiation of taboo terms in a purely indexical (nonsymbolic) mode that has no semantic sense construal associated with it and the use (or avoidance) of phonologically iconic variants of taboo terms. The phonological form of the signifiant is no longer an absolute and final limit condition on performative function when it is wholly decoupled from its semantic signifié. In these verbal taboos, untethered from even the most virtual “constativity,” performative function may become essentialized as inherent in a phonetic substance devoid of semanticity. I will return to this connection in Section 3.

Summary of Section 1

At this point I think that we are prepared to give the broader framing of the structural opposition between explicit performativity and rigid performativity (or verbal taboo) promised at the outset. This is profitably conceptualized in terms of a distinction between symbolic-descriptive and indexical-performative functions of signal types. Remember, in both kinds of phenomena we are concerned with a doubling of “the same” signal (e.g., fuck as purely expressive or fuck as a sense-bearing lexeme; promise as purely descriptive or promise as public commitment to a course of action). Our question then will be how symbolic and indexical functions are distributed over these doublets.

EPs and RPs represent different relationships of domination between symbolico-semantic and indexical-performative functions. In the case of EPs, the symbolico-semantic richness of metapragmatic function envelops the performative function. The explicit performative occurrences account for just a small subset of the tokens of the signal type. And even in these cases, a semantic (constative) function co-occurs with the pragmatic (performative) one. The signal /pɹɑmis/ always has a semantico-referential function of denoting events of public and declared commitment to do something (see the left side of fig. 3). In the case of RPs, the orders of regimentation are reversed. Here the indefeasibility of the signal type means that all tokens have performative effects. All tokens have an expressive function, and indeed many are purely expressive (= purely pragmatic). That is, they contribute no semantic-symbolic content to the proposition (cf. “what a [fucking] terrible book,” “what she did is totally un[fucking]acceptable,” etc.). The -fucking- infix in “Unfuckingbelievable” is purely indexical-performative. Of course, some subset of tokens of fuck does make a contribution of propositional content, whether that be a literal first-order semantics (“They fucked in the Lincoln bedroom.”) or a second-order semantics (“When I got to the bar, they were already fucked up.”). But that semanticity always co-occurs with a pragmatic residue—these utterances still count as instances of the social act of cursing. Here the indexical-performative function englobes the symbolic-semantic one (see the right side of fig. 3). Indeed, as we have seen, the second-order semantics of these expressions are diachronically guided by a pragmatic-to-semantic analogy where stance-effects (Speaker’s performed orientation toward En) are synchronically subtended by speech register considerations proper to the local, here-and-now interaction.

Figure 3. 

Inverted relations of encompassment of symbolic-descriptive and indexical-performative functions for EPs (left) and RPs (right). Area enclosed by a dotted line indicates a domain of denotational function, whereas areas enclosed by a solid line indicate a domain of performative function.

The orders of regimentation between meta-sign and object-sign are reversed:

For explicit performatives        whereas for rigid performatives
semantics > pragmatics            pragmatics > semantics
symbolism > indexicality          indexicality > symbolism
constative > performative         performative > constative
denotation > interaction          interaction > denotation
where the “>” sign should be read as the item on the left dominates, regiments, conditions, metapragmatically determines, and so on, the item on the right. In the final analysis it will be this relationship between interaction and denotation (or “interactional-text” and “denotational-text” in the terminology of Silverstein 1993) that will be of most interest to us.

Our discussion has thus exposed a kind of complementarity between EPs and RPs: with EPs, we have a highly precise and rich characterization of the act accomplishable by use of the signal, but that signal is relatively lacking in “force,” in the sense that tokens are highly defeasible, subject to performative unhappinesses of one kind or another. These are linked propositions; it is precisely because a semantically rich denotational text englobes and fully regiments interactional text that explicit acts have their nuance and specificity (compare promise with bequeath with swear an oath with commit, etc.). In the case of RPs there is a high “force” of the performative signal; it is impossible to defease the pragmatic function of tokens (an issue logically distinct from normative acceptability).12 The social act accomplished by the utterance remains, however, ambiguous (i.e., it is not metapragmatically regimented by the signal itself). As we have seen, this complementarity is mediated by metapragmatic blocking.13 But even where normative proscriptions do not impede the integration of RPs into metapragmatic discourse, the efferent pragmatic function will contour the semantics of the signal qua metapragmatic lexeme (cf. fuck up/around/with). The point is that the descriptive function cannot, in the case of RPs, occur in a fashion prescinded from the pragmatic function.

2.  Participant Roles

At this point I hope that one layer of the argument is coming into focus. A condition of possibility of metapragmatic discourse (itself a particular kind of symbolic function) is the possibility of defeasing, bracketing, or prescinding the signals that will supply that metapragmatic lexicon from the pragmatic effects that they elsewhere accomplish. To anticipate, I will argue that such a bracketing is necessary for symbolism in general—that decontextualizability is a condition of possibility of symbolism. We can see this by focusing in particular on the relationship between semantic symbolism and reported speech constructions. As we have already seen, for EPs, metapragmatic symbolism is prescinded from performative indexicality where En ≠ Es, that is, in instances where there is a “reportive calibration” of the metapragmatic description inscribed in the denotational text onto the ongoing interactional text (Silverstein 1993). The doubling effect of reported speech is not exclusive to performative function. The connection between semantic sense and reported speech was long ago recognized by Gottlob Frege, who observed that represented speech constructions are the naturally occurring discursive contexts in which the semantic senses of words and expressions are prescinded from their denotations (Lee 1997, 35). Just as metapragmatic symbolism is prescinded from indexico-performative function in reported speech, so too does quotation abstract semantic sense from indexico-referential function. There is, then, an intimate connection between symbolism and the framework of speech participant roles, for to engage in metapragmatic discourse is to presuppose Speaker and Addressee as participant roles abstractable from the quasi-physical nodes of signal sender and signal receiver in the here-and-now event of signaling.

In the explicit performative, of course, the participant roles of Speaker and Addressee are overtly coded as 1st person subject and 2nd person (indirect) object, respectively (“I promise you that …”). As I show below, it is often difficult to give rigid performatives (RPs) a similarly straightforward mapping onto participant roles. As we will see, this difficulty emerges from the same problem of the perdurance of the efferent pragmatic function that motivates metapragmatic blocking.

In this section I illustrate the connection between performativity and participation through the analytic of indexical focus. Indexical focus is an analytic framework proposed by Agha (1993), first developed for honorifics, that is intended to characterize how nonreferential indexical functions are mapped onto speech participant roles: “The signaling of deference entitlement (or, simply, deference) appears to have the structure ‘deference to somebody from somebody,’ or more precisely ‘deference to [role_1] from [role_2].’ I will say that the interactional role category to which deference is directed is the focus of deference, and the interactional role category from which the deference emanates is the origo of deference, so that deference in this sense is always ‘deference to [role_focus] from [role_origo]’” (134). Though I make no substantive alteration to Agha’s model, for technical reasons that needn’t bother us here, I employ the term target for Agha’s focus and the term focus for the set of interactional roles whose social predicates are presupposed by the index (i.e., {[origo + target] = focus} for an honorific).

The analytic of indexical focus presupposes a relatively stable comparative framework for participant roles. The generalizability of frameworks for participant roles has, however, been the cause of debate in linguistic anthropology. The foundational text here is Erving Goffman’s “Footing” (1981), where the author decomposes both the sender and receiver roles into multiple subpartitions. The Goffmanian reflection introduces a helpful distinction between what we might call the etic substance and the emic realization of participant roles:

In canonical talk, one of the two participants moves his lips up and down to the accompaniment of his own facial (and sometimes bodily) gesticulations, and words can be heard issuing from the locus of his mouth. His is the sounding box in use, albeit in some actual cases he can share this physical function with a loudspeaker system or a telephone. In short, he is the talking machine, a body engaged in acoustic activity, or, if you will, an individual active in the role of utterance production. He is functioning as an “animator.” Animator and recipient are part of the same level and mode of analysis, two terms cut from the same cloth, not social roles in the full sense so much as functional nodes in a communication system.

(Goffman 1981, 144)
Here then, animator and recipient make reference to the necessary linguistic and pragmatic competences of signal senders and receivers and to the spatio-temporal affordances of a channel. They make reference to the conditions of possibility of language-mediated human social interaction. This is the etic framework of interaction.14 To what degree does this etic grid for participation condition and constrain emic realization?

Two scholars who grapple with this question while trying to extend Goffman’s model are Levinson (1988) and Irvine (1996). In “Footing,” Goffman had broken the role of speech production into animator, author, and principal, and the role of speech reception into ratified and unratified recipient, overhearer, and eavesdropper. Levinson seeks to further decompose participant roles by means of a distinctive feature notation. Irvine counsels against this method, arguing instead that we should focus on the “fragmentation process” through which participant roles are creatively and productively rendered ambiguous. As Irvine’s argument by exemplification effectively illustrates, the comparative structure of participant roles should not be looked for in an ever more fine-grained dissection of role fractions. The more we move in this direction, the more we are moving into the space of highly culturally particular realizations of participation.

This doesn’t mean that there is not cross-linguistic convergence in what we might call a first-order stratum of emically realized participant roles. Even Irvine—the relativist to Levinson’s universalist—suggests that a tripartite distinction between “Speaker, Addressee, and third parties present and absent” (Irvine 1996, 135) appears to be universal. The basis for this distinction is, of course, the tripartite distinction of the indexically referential grammatical category of person into 1st, 2nd, and 3rd. Here Irvine follows Benveniste’s (1966) line in “Subjectivité dans le langage.” Benveniste argues that subjectivity is a grammatical effect of person, and in particular of the reflexive predication of self as subject that occurs in the use of the 1st person: “La ‘subjectivité’ dont nous traitons ici est la capacité du locuteur à se poser comme ‘sujet.’”

But if we are to identify Speaker and Addressee with facts of language structure (broadly construed) it should not just be with person, which is only the most emblematic (since word level, or at least easily segmentable, in the case of person agreement) of grammatical constructions that presuppose the “psychological reality” of the speech event. More important is the reflexive representation of En in Es—that is, the universality of represented speech constructions. All languages have the capacity to represent speech in speech. The possible nonexistence of the grammatical category of person in some Southeast Asian languages (a fact glossed over by Benveniste) or (and this is a subject of some debate [Todd 2009]) in numerous sign-languages, does not mean that users of these languages lack some conceptualization of Speaker as a subject whose agency manifests itself in her or his ability to consummate discourse. Every language community has metapragmatic terminology that represents language as goal-directed intentional activity mediating relations between senders and receivers.15

It is precisely in this reflexive representation of speech in speech (sign in sign, writing in writing, etc.) that there emerges a disjuncture between a speaking subject who speaks for him- or herself at one moment and for another at another. It is the metapragmatic machinery of “saying”—the name that speaking has for itself—as much as that of “I”—the name that Speaker has for itself—that seems crucially important here. The roles of Speaker and Addressee only emerge where the etic space of speech production is differentiated from itself. The role of Speaker emerges only when the Animator role is not inhabited by the same person who is the (now) Speaker of the utterance; the role of Addressee only emerges where that role isn’t inhabited by the (here-and-now) speech Recipient. There is an important connection then between EPs and speech participant roles. What it means to be a Speaker is given its rich significance only through metapragmatic discourses in which Speaker (En) is not the same as Animator (Es). That is, it is given its specification through the very same metapragmatic lexicon whose entries function as EPs when reflexively calibrated to Es.

As represented in figure 4, the emic participant roles of Speaker and Addressee emerge from the etic substance of speech production and reception (i.e., the Animator and Recipient roles). At the same time, the Referent role emerges from a space of possibility of indexing others (i.e., a maximally residual and underspecified Nonparticipant indexicality).

Figure 4. 

Relationship between etic ground for, and emic realization of, participant roles. (Etic ground is enclosed by the thinner outline, emic role by the outline in bold.)

The Speaker-Addressee Dyad as a Whorfian Universal

So the crystallization of Speaker and Addressee as distinct from Animator and Recipient emerges in “the relationship between the quoted ‘I’ [or ‘you’–LF] of discourse and the indexical referential ‘I’ [or ‘you’] of the language code” (Urban 1993, 29). But then what motivates the strongly dyadic character of the Speaker-Hearer folk construct? “Traditional analysis of saying and what gets said seems tacitly committed to the following paradigm: Two and only two individuals are engaged together in it. During any moment in time, one will be speaking his own thoughts on a matter and expressing his own feelings, however circumspectly; the other listening. … The two-person arrangement here described … informs the underlying imagery we have about face-to-face interaction” (Goffman 1981, 129). In the spirit of Silverstein’s (1979) Whorfian analysis of Austin, one might fashion a Whorfian reading of, for instance, Saussure’s dyadic Sender-Receiver model, or of Bühler’s “I-thou-it.” Indeed, it may be that the dyadic Speaker-Hearer model of the speech event approaches what we might call a Whorfian universal.16

The grammatical category of person is canonically defined in a tripartite manner. As we have already established, person categories can be defined in terms of the participant roles that the individuals differentially referred to by tokens of each person value occupy in the discursive event of which that signal is a part. That is, persons can be defined in relation to the speech-act participant roles of Speaker and Addressee, where the 3rd person is a residual, or unmarked, category with respect to the 1st and 2nd persons.

1st person : Speaker [S] is Referent
2nd person : Addressee [A] is Referent
3rd person : Referent [R] is neither S nor A
The grammatical category of person mediates successful reference by using the structure of role inhabitance in the ongoing interaction as a matrix that offers coordinates for the act of reference. Participation in discursive interaction, then, serves as the self-evidence not only of role occupancy (i.e., Speaker versus Addressee versus Referent) but also of personhood, a performative category ratified by languaging rather than a natural kind. (I think this is the gist of Benveniste’s argument about subjectivity.)
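The three-way definition of person given above can be restated as a small sketch. This is only an illustration of the definitional logic, assuming a singular referent; the names Role and person_of are hypothetical labels introduced here and are not part of the original analysis.

```python
from enum import Enum


class Role(Enum):
    """Participant role occupied by the individual being referred to."""
    SPEAKER = "S"
    ADDRESSEE = "A"
    OTHER = "R"  # a Referent who is neither Speaker nor Addressee


def person_of(referent_role: Role) -> int:
    """Return the person value (1, 2, or 3) for a singular referent.

    The 3rd person is the residual case: it applies whenever the
    referent is neither Speaker nor Addressee.
    """
    if referent_role is Role.SPEAKER:
        return 1
    if referent_role is Role.ADDRESSEE:
        return 2
    return 3
```

The final return carries the point made in the text: the 3rd person is not positively specified but is what remains once Speaker and Addressee reference have been excluded.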

I will now argue that it is the particular way in which the “true” person categories are anchored to participant roles that accounts for the dyadic character of folk models of participation. It was, again, Benveniste (1966, 233, discussed in Silverstein 1976, 38; Cysouw 2003, 70) who observed that 1st and 2nd nonsingular forms have as their default interpretation an associative plural reading. The denotation of the English plural noun chairs can be represented iconically as a set with a structure something like this: {chair + chair + chair …}. That is, it can be represented as a set that contains more than one entity of identical type or kind. The denotation of the English pronoun we has a rather different structure; its default interpretation is not something along the lines of: {I + I + I …}. It does not denote a set of many speakers. It denotes the Speaker plus some other or others. Similarly with 2nd person plural forms—y’all does not pick out a set {you + you + you …}. In fact, no language ever described distinguishes a 2nd person plural limited only to true addressees from a 2nd plural that has a default associative reading (Cysouw 2003, 296; Wechsler 2010, 335).17 The important point for our purposes is the following: Speaker and Addressee are point-like nodes that anchor referential indexical functions; Speaker indexing and Addressee indexing are monadic functions. We can see this by comparing three typologically robust paradigmatic structures of person marking (modeled after Cysouw 2003).

In figure 5 we compare three types of person marking. Relatively underspecified systems, like that of French (see paradigm 1 in fig. 5), exhibit a basic distinction between singular and nonsingular. In such languages, the 1st nonsingular form does not specify whether Addressee is a referent or not (for this reason it is unmarked in the table). Similarly, the 2nd nonsingular form is underspecified in terms of whether the plural set consists exclusively of co-present individuals or of some co-present individual(s) + some nonparticipant(s) (this is also the case with 2.augment forms). The 3rd plural is the only plural in this paradigm that specifically indicates that more than one member of a given participant role category (S, A, or R) is included in the referential set. This follows from the character of the 3rd person as a category negatively specified for either S or A reference. In more elaborated paradigms (for instance, Tamil) a distinction is made between 1st exclusive and 1st inclusive nonsingulars (see paradigm 2). The exclusive (as well as the 1st minimal augment) includes Speaker and excludes Addressee from the referential set. The inclusive includes Speaker and Addressee in the referential set, though possibly others as well. Paradigm 3 schematizes so-called minimal-augment systems. A 1|2.minimal form (for instance, in Ilocano) specifies just the Speaker-Addressee dyad, and in this sense has the same denotation as a 1st person inclusive dual (setting aside important differences in paradigmatic patterning).
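The contrast among the three paradigm types can be sketched as a labeling function over referential sets. This is a rough illustration only: the function name, the paradigm keys, and the label strings (including "1|2 augmented" for the Speaker-Addressee dyad plus others) are assumptions introduced here, and real paradigms involve number and syncretism patterns that the sketch ignores. In every branch the Speaker ("S") and Addressee ("A") enter the set at most once each, which is the monadic property at issue.

```python
def label_first_person(referents, paradigm):
    """Label a referential set that includes the Speaker ("S").

    referents: iterable of "S", "A", or any other string (a nonparticipant).
    paradigm: "singular/nonsingular", "inclusive/exclusive",
              or "minimal/augmented" (hypothetical keys).
    """
    refs = frozenset(referents)
    if "S" not in refs:
        raise ValueError("a 1st person form must include the Speaker")
    has_addressee = "A" in refs
    others = refs - {"S", "A"}  # nonparticipants in the set

    if paradigm == "singular/nonsingular":
        # Underspecified type: Addressee inclusion is not marked.
        return "1 singular" if refs == {"S"} else "1 nonsingular"
    if paradigm == "inclusive/exclusive":
        if refs == {"S"}:
            return "1 singular"
        return "1 inclusive" if has_addressee else "1 exclusive"
    if paradigm == "minimal/augmented":
        if has_addressee:
            return "1|2 augmented" if others else "1|2 minimal"
        return "1 augmented" if others else "1 minimal"
    raise ValueError(f"unknown paradigm: {paradigm}")
```

For the set {"S", "A"}, the same dyad is labeled "1 nonsingular", "1 inclusive", or "1|2 minimal" depending on the paradigm, while no branch needs to count multiple Speakers or multiple Addressees.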

Figure 5. 

Three cross-linguistically common pronominal paradigms.

In summary, and as illustrated in figure 5, despite differences in the paradigmatic structure of person marking, person marking in language always treats Speaker and Addressee as monadic values. This may reflect the semiotic architecture of indexical reference. Just as the event of speaking has a point-like characterization in temporal deixis, Speaker and Addressee indexing has a point-like structure. But this interpretation should not be assumed (the curious reader is referred to Wechsler [2010] and Harbour [2017] for important alternative interpretations of this universal).

The folk intuition that speech is, at its heart, instantiated in a dyadic relationship between a unitary speaker and a unitary addressee is likely motivated by this monadic character of referential participant indexicality. Note that, as with other Whorfian distinctions seen as conditioning forms of habitual thought, the associative plural character of person deixis is covert, revealed only in the formal syncretisms between two distinct readings of the 1st and 2nd person nonsingulars (i.e., as sets of individuals who are exclusively co-present or not). The Speaker-Hearer-Referent folk model of participant roles seems to be motivated by the monadic indexical structure of participant deixis revealed in closely attending to the possible extensions of 1st and 2nd person nonsingulars as compared with 3rd person nonsingulars. This intuitive tripartite division of participants, into Speaker and Addressee roles in the signaling event (Es) plus the Referent of the narrated event (En), draws upon the logic not simply of person categories as such but also upon the point-like semiotic architecture of referential person indexicality.

Speaker-Addressee-Referent Social Indexicality

For many social indexicals, indexical focus can be neatly and exhaustively specified in terms of this cross-linguistically convergent tripartite set of emic participant roles of Speaker, Addressee, and Referent that correspond with person categories. For instance, in typological study of the phenomenon, mappings of sex-based gender features onto each of the roles of S, A, and R and their combinations are attested (see fig. 6). Where only the gender of referent is specified, we are safely on the solid ground of (sex-based semantic) “gender” as a canonical category of classical grammar (Corbett 1991). Semantic gender assignment for human nouns often varies depending upon the presupposed social gender of the discourse Referent (see Wechsler’s [2009] treatment of French or so-called hybrid nouns in Corbett 1991). But other languages index the same social information nonreferentially. As examples of one-place or “absolute” (Levinson 1983) nonreferential gender indexicals, take Basque and Karajá. In Basque (isolate), verb-final particles presuppose the social gender of Addressee regardless of the gender of Speaker or Referent—thus diagok ‘he/she/it stays’ (addressed to a man) versus diagon ‘he/she/it stays’ (addressed to a woman) (Alberdi 1995). In Karajá (Macro-Gê), morphophonological variants—in particular, [k] versus [Ø]—presuppose Speaker gender independently of the gender of the Addressee or the Referent (Fortune and Fortune 1975).

Figure 6. 

S, A, and R roles in the indexical focus of categorical gender indexicals.

Occasionally languages specify relational gender features for nonsingular referents. So, for instance, in Muhiang-Arapesh, subject cross-referencing prefixes on the verb in m- mark masculine plural, in w- the feminine plural, and in s- a mixed gender or unknown gender plural (Alungum, Conrad, and Lukas 1978; cf. the Icelandic neuter). Sex-based gender in Muhiang-Arapesh is a semantic category; it exclusively characterizes the social gender of discourse Referents. Functionally parallel relational or two-place nonreferential gender indexing is also attested. In the well-known case of Yana, a number of morphophonological variants distinguish the speech of men addressing men—that is, of masculine Speaker-Addressee dyads—from all other Speaker-Addressee gender pairings. One such variant is the addition of a semantically empty word-final suffix, -na, to “noun forms which do not end in a short vowel in the theme, all monosyllabic noun themes, demonstratives, and a large number of verb forms” (Sapir 1963, 209). Finally, in a number of languages 3rd person anaphors exhibit a nonreferential gender distinction. One example is Aoheng, an Austronesian language of Borneo (Sellato 1981, cited in Blust 2009; cf. Rose 2013). In Aoheng, the 3rd singular anaphor does not encode semantic gender. There is, however, an additional form, ana, employed by male Speakers in referring to masculine Referents, which thus functions as a nonreferential Speaker-Referent gender indexical.
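The cases surveyed above can be collected into a small lookup table that records, for each language, which participant roles make up the indexical focus of its gender marking. This is a summary sketch only: the field names and the helper focus_of are hypothetical, and the entries simply restate the descriptions given in the text (S = Speaker, A = Addressee, R = Referent).

```python
# Each entry restates one case from the discussion above.
GENDER_INDEXICALS = [
    {"language": "Basque", "form": "verb-final particles (diagok / diagon)",
     "focus": frozenset({"A"}), "referential": False},
    {"language": "Karajá", "form": "[k] versus [Ø] variants",
     "focus": frozenset({"S"}), "referential": False},
    {"language": "Muhiang-Arapesh", "form": "subject prefixes m- / w- / s-",
     "focus": frozenset({"R"}), "referential": True},
    {"language": "Yana", "form": "male-to-male variants such as suffix -na",
     "focus": frozenset({"S", "A"}), "referential": False},
    {"language": "Aoheng", "form": "3rd person anaphor ana",
     "focus": frozenset({"S", "R"}), "referential": False},
]


def focus_of(language):
    """Return the set of roles in the indexical focus for a language above."""
    for entry in GENDER_INDEXICALS:
        if entry["language"] == language:
            return entry["focus"]
    raise KeyError(language)


def languages_with_focus(roles):
    """List the languages whose gender marking has exactly this focus."""
    wanted = frozenset(roles)
    return [e["language"] for e in GENDER_INDEXICALS if e["focus"] == wanted]
```

Laid out this way, every focus in the table is a subset of {S, A, R}, which is the sense in which these indexicals are exhaustively specifiable over the tripartite role set.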

As these examples illustrate, categorical gender indexicality fits tongue-in-groove with the Speaker-Addressee-Referent triad of speech participant roles. We know that this is an accurate description of participant roles for these cases because of the way in which these gender indexicals function in reported speech constructions. In such contexts gender indexicals correspond to the represented Speakers (En) and Addressees (En) rather than to those in the here-and-now event of signaling (compare logophoric pronouns for referential person indexicals).18 Categorical gender indexicals are highly defeasible indexes in the sense that their default values are easily recentered so as to characterize the social gender of the represented Speaker or Addressee of some En in quotative contexts without any residual signification of Animator or Recipient gender in Es. They parallel explicit performatives in the sense that they offer highly specific characterizations of social gender, but are effete in terms of their force.

Indexical Focus, Participant Role Denaturing, and Rigid Performativity

We can compare such highly “presupposing” nonreferential indexicals of social gender to social indexicals that can be given a functional characterization as rigid performatives. As we have seen, the performative effects of curse words in Es are not defeased even when reportively calibrated into some En. This problem of pragmatic perdurance is reflected in the specialized repertoire of citational forms used to report upon—without replicating the effects of—English curses and blasphemies (e.g., “the F-word,” “four-letter word,” etc.). Take, for instance, the following exact reproduction of a passage from an article posted on the ESPN website about Rutgers coach Mike Rice’s treatment of his players captured on videotape: “In addition to Rice’s physical actions seen in the practices, Rice calls Rutgers players ‘f----ts,’ ‘m----- f-----s,’ ‘p-----s,’ ‘sissy b-----s,’ and ‘c---s,’ among other epithets” (van Natta 2013). Hyphens here operate as orthographic noise—each hyphen stands for one orthographic character that has been redacted. Specialized citational forms like these avoid replicating the taboo effects of the very performative formulae whose occurrence they serve to report. In cases like these, the entailments of social indexicals in Es are not defeased by reportive calibration. It is not just Speaker (here, Mike Rice) who is “responsible” for the obscenities; speech Animators (here, ESPN) are also indexically soiled by the dirty language they report.
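The orthographic convention at work in the quoted passage, one hyphen per redacted character, can be sketched as follows. This is a minimal illustration and not a description of any publisher's actual procedure: the function name redact and its parameters are hypothetical, and the choice of how many edge letters to keep is an assumption (the quoted forms keep the initial letter and one or two final letters).

```python
def redact(word: str, keep_head: int = 1, keep_tail: int = 1) -> str:
    """Replace each interior character of `word` with a hyphen.

    The redacted form has the same length as the original, so every
    hyphen stands for exactly one removed orthographic character.
    """
    if len(word) <= keep_head + keep_tail:
        return word  # too short to redact without exposing nothing new
    hidden = len(word) - keep_head - keep_tail
    return word[:keep_head] + "-" * hidden + word[len(word) - keep_tail:]
```

With the neutral placeholder "example", redact("example") yields "e-----e" and redact("example", 1, 2) yields "e----le". Because length and edge letters are preserved, the original form remains recoverable by the reader, which is why such citations report the taboo form without reproducing it.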

In quotations like this one there is always the risk of a double indexicality—a violent speech act is attributed to the emic role of represented Speaker, but where it is not hedged by linguistic avoidances it also returns to the etic role of speech Animator. Such dynamics are not, of course, limited to English curse words, slurs, and obscenities. To illustrate this double indexicality (analogous to the Speaker/Animator duplicity that we have just seen) for the targeting of the role of Addressee and Referent, I draw on linguistic registers geared not toward the indexing of social gender but rather to the management of in-law (affinal) relations. Here again, the double indexicality of rigid performatives targets both an emic role (Addressee and Referent) and its etic ground (Recipient and Nonparticipant, respectively). Those with a passing familiarity with the ethnological literature will have read descriptions of affinal avoidance relationships that are commonly hedged around with taboos, whether material, interactional, or linguistic. Drawing on materials from Korowai (Papuan), Stasch (2003) observes that these multimodal practices effectuate motivated icons. Not saying mother-in-law’s name is like not seeing her, which is like not touching her, and so on.

Let’s focus on the specifically linguistic dimension of these affinal taboos. In many linguistic communities name avoidance is enregistered as an emblem of respect toward Referent. The same is true in Korowai mother-in-law avoidance. Only here, just as with the senseless or purely expressive use of English curse words discussed above, the pragmatics of personal names is not determined/defeased by their semantico-referential properties: “Between a man and his affines … there is a formal prohibition on name utterance, and mother-in-law and son-in-law pairs are the most careful observers of this prohibition. Since many people’s names are high-frequency words with independent meaning, avoidance of a son-in-law or mother-in-law’s name often involves artful work of circumlocution” (Stasch 2003, 323). Homophones of the name as well as the name itself are avoided. In such homophone avoidance we find that classic character of taboos to spread iconically; here “we find prohibitions on the use of … names … which ‘contaminate’ any words with a phonetic resemblance to these names” (Lévi-Strauss 1966, 176–77). Semantic sense or discourse reference does not succeed in defeasing the pragmatics of these expressions encircling the mother-in-law’s name. Once again there is a double indexicality. Respect toward mother-in-law is accomplished in acts of reference that avoid the use of her proper name; this is a canonical Referent-focal honorification of the kind studied by Brown and Ford (1961) for English. But respect is also enacted through the avoidance of lexical items with the same phonemic shape as the name but that do not refer to her person, possessions, or actions. Since mother-in-law here need neither be a discourse Referent nor an Addressee nor a Bystander, she occupies a maximally unmarked participant role; a homophone of the name of the mother-in-law is a nonparticipant index (cf. Irvine’s [2009] “remote focus”). 
Here Referent-focal indexicality falls back into the etic space of Nonparticipant indexicality; that is, it falls back into the etic space that serves as the condition of possibility for indexing persons (whether referentially or nonreferentially) in the first place.

For Nuaulu (West Ceram), the relationship between sau monne ‘sacred relations’ (genealogically cross-sex sibling’s spouse and spouse’s same-sex sibling and their reciprocals) involves name taboos of this kind: “These people cannot say each other’s names or even words that are partial or complete homonyms of their names. There are also behavioral restrictions that include not eating from a plate that the other has used, not eating the other’s leftover food or chewing their leftover areca nut, and not talking loudly, joking, or cursing in their presence. Two people who are sau monne to each other are permitted to talk as long as they are not near each other. They should be across the room from each other but talk softly” (Florey and Bolton 1997). In Nuaulu namesake and name-homonym avoidance, the taboo words whose use risks performatively rupturing affinal relationality have the appearance of being submerged in the phonetic surround of the phonemic form that serves as the rigidly indexical designator (i.e., name) of the sau monne.19

In figure 7, the dotted line encloses the form type (i.e., Tukanesi) employed to refer to the indexical target (i.e., the in-law), while the solid line encloses forms whose tokens performatively insult the indexical target (cf. fig. 3). The use, by Tukanesi’s sau monne, of words like tuka ‘make’ and nesie ‘left’ can be performatively rupturing of the affinal relationship even though Tukanesi is not a topic of discourse. The taboo words and their enregistered substitutes are nonparticipant indexicals.

Figure 7. Nuaulu words tabooed in the idiolect of Tukanesi’s sau monne. Area enclosed by a dotted line indicates a domain of denotational function, whereas areas enclosed by a solid line indicate a domain of performative function.

Aboriginal Australian mother-in-law registers represent yet another functional organization of verbal taboo and avoidance. Formally, mother-in-law vocabularies consist of large lexical repertoires. Functionally, they have a quite complex discursive instantiation. This can be seen in figure 8 for the Wik language of the Cape York Peninsula (pattern reconstructed from Sutton [1978] and Thomson [1935]).

Figure 8. Gradient avoidance of everyday speech, and corresponding usage of mother-in-law language, by kin relationship of indexical origo to indexical target in Wik. An x indicates the normative obligation to use mother-in-law language, and an asterisk (*) indicates a taboo on use of all speech (i.e., the avoidance of address toward the kin relation). The kin propositus is here equivalent to indexical origo and kin referent to indexical target. Diagram is from the perspective of a male ego and speaker.

In mother-in-law languages the range of indexical focus types across which speakers employ the avoidance vocabulary is an iconic index of how heightened the avoidance relationship is between origo (= Speaker or Animator) and target (= Referent, Addressee, or Recipient). For the most tabooed social relationships, like the one between mother-in-law and son-in-law, the mother-in-law language was supposed to be employed whenever individuals in this relation found themselves in each other’s co-presence. The mother-in-law language functioned here as a recipient-focal index. For other less ritually sensitive avoidance relationships—like the relationship between an elder brother and a younger sister—the avoidance register was employed in address but not in all co-present linguistic communication. For still other respect relationships, like the adjacent (disharmonic) generation consanguineal relationship between father and daughter, mother-in-law vocabulary would be employed in reference to the actions of the relative, while elsewhere everyday vocabulary would be employed.

I call honorific or avoidance registers of this kind fluid focus systems (Fleming 2016), since tokens of the same honorific types have different indexical focus characteristics depending upon the identity of the indexical target. The potential for conflicting interpretations of honorific values is reduced by the following de facto pragmatic hierarchy of indexical focus types:

Recipient(Es) > Addressee(Es) > Referent(En)

This hierarchy reflects the fact that recipient-focal usage functionally suspends addressee-focal usage, and both recipient-focal and addressee-focal usages suspend referent-focal usage. If one addresses an adult cross-sex sibling, then all referent-focal indexing of social relationship (e.g., use of everyday speech in referring to the actions of a joking relation) must be suspended if one wishes to respect one’s cross-sex sibling. If one is in the co-presence of a mother-in-law, then all addressee-focal indexing of social relationship (e.g., addressing an alternate [harmonic] generation same-sex consanguine) should be suspended if one is to respect one’s mother-in-law. Observe that in addressee- and recipient-focal uses, social indexicality is again mediated exclusively by interactional text without any appeal to the content of the denotational text. Further, in the most taboo relations even the emic interactional role of Addressee is denatured into the would-be etic substance of recipienthood. Mere co-presence (regardless of who is the Addressee) demands that the mother-in-law vocabulary be employed. Indeed, in this recipient targeting there is a suspension of all register shifting, giving such speech a monologic quality about which Merlan has written with piercing insight: “The mother-in-law avoidance taboo, in which the highly prescribed and institutionalised forms—aversion of gaze, relative taciturnity, special proxemics, and so on—tend to give one aspect of the relationship between people overriding determination of their conduct in each other’s presence. Thus, a man in his mother-in-law’s presence finds it difficult to behave towards her in any way other than as her son-in-law; and further, his conduct towards everyone else on the scene is very strongly shaped by their co-presence, and the social emphasis placed on it” (1997, 108, cited in Stasch 2003).
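
The suspension logic of this hierarchy can be made concrete in a minimal sketch. The Python below is purely illustrative and rests on stated assumptions: the names `Focus`, `WIDEST_FOCUS`, and `governing_focus` are hypothetical, the three relations and their thresholds are read off the prose description above rather than taken from the Wik data in figure 8, and the sketch further assumes that a relation which triggers avoidance at one focus type also triggers it at every lower-ranked type.

```python
from enum import IntEnum
from typing import Dict, List, Optional


class Focus(IntEnum):
    """Indexical focus types, ranked as Recipient > Addressee > Referent."""
    REFERENT = 1   # anchored in the event of narration (En)
    ADDRESSEE = 2  # anchored in the event of signaling (Es)
    RECIPIENT = 3  # anchored in the event of signaling (Es)


# Widest focus type at which each relation calls for avoidance vocabulary.
# Illustrative values only (assumption), based on the prose description.
WIDEST_FOCUS: Dict[str, Focus] = {
    "mother-in-law": Focus.RECIPIENT,      # mere co-presence suffices
    "cross-sex sibling": Focus.ADDRESSEE,  # avoidance in address
    "father": Focus.REFERENT,              # avoidance in reference only
}


def governing_focus(roles: Dict[Focus, List[str]]) -> Optional[Focus]:
    """Return the highest-ranked focus type at which some participant
    triggers avoidance, or None if everyday speech is unconstrained."""
    for focus in sorted(Focus, reverse=True):  # Recipient first
        for kin in roles.get(focus, []):
            widest = WIDEST_FOCUS.get(kin)
            if widest is not None and focus <= widest:
                # A higher-ranked focus suspends all lower-ranked ones.
                return focus
    return None


# A mother-in-law is co-present while a joking relation is addressed:
event = {
    Focus.RECIPIENT: ["mother-in-law"],
    Focus.ADDRESSEE: ["joking relation"],
}
print(governing_focus(event))
```

On this toy model, the co-presence of a mother-in-law settles the register choice before the identity of the Addressee or the Referent is even consulted, which is one way of picturing the "monologic quality" of recipient-focal speech.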

In all of these cases (i.e., in ESPN reporting, Korowai name taboo, and Wik mother-in-law avoidance), rigid performativity spills out over the tripartite interactional role framework of {Speaker-Addressee-Referent} that so exhaustively mediates explicit performativity. We have already shown that English curse words illustrate the Animator > Speaker suspension, but it should be observed that instances of the Recipient > Addressee and Nonparticipant > Referent suspensions can also be observed in anglophone contexts. For instance, the interdiction against cursing in the co-presence of young children involves a categorical rule conditioned by recipienthood. We have already seen that lexemes phonologically iconic with curse-words though semantically unrelated to them may condition analogous pragmatic effects (e.g., fuck, fudge, fooey). But homophones of curses, slurs, and epithets may also be subject to avoidance or, inversely, linguistic play of various sorts. The adjective niggardly is now so rare and so tightly associated with the N-word that one must imagine that those who employ it (like the last users of the epicene he) are reactionaries, trolls, racists, or some combination of the above. For cases of titillation, think of jokes like “What do you get when you breed a bulldog with a shitzu?” or the entire branding concept of F.C.U.K. (French Connection United Kingdom).

3.  Symbolism and Naturalization Runaway

I thank the reader who has made it this far. I hope that there is some reward among the weeds. The theme throughout has been the tension between symbolico-semanticity and performative-pragmaticity. This tension reaches its apotheosis in phonologically based taboos, cases where performative function is no longer tethered to semantic sense or discourse reference. At the end of part one we discussed the formal-functional correlation between phonologically based substitutions of curse words (i.e., signifier-iconic forms) and their use in purely “expressive” functions that are semantically vacuous (i.e., signified-empty functions). We saw that phonological substitutions (e.g., shit, shoot, shucks) only occur when curse-words are in purely expressive function, but not where a constative-semantic function is involved (i.e., “a piece of shit/*shoot/*shucks”). We have just seen that a similar phenomenon often emerges in the surround of names where these function as rigid performatives (see Fleming 2011 for a fuller treatment). A Nuaulu speaker who avoids the name (Tukanesi) of his or her in-law also avoids saying words (e.g., tuka ‘to make’, nesi- ‘tooth’, nesie ‘left’) that are signifier iconic with that name but that have no similarity in terms of their semantic signifieds or reference. If with EPs, the signal is functionally doubled into illocutionary-act-accomplishing and symbolic-description-producing functions, in these RPs there is a formal doubling where distinct but phonologically iconic forms participate in a shared rigid performativity (see fig. 9).

Figure 9. Formal and functional doubling of RPs and EPs, respectively.

Phonologically based “rigid performativity reflects and reproduces a conceptualization of pragmatic function which, divorced from (lexical-)sense and (discourse-)reference, is left only with the phonetic substance of offending tokens to anchor its rationalizations” (Fleming 2014c, 64). Here ethnolinguistic ideology is precisely decoupled from the referentialist biasing that typically channels its rationalizations. In these cases, indexical causality cannot be understood as mediated by sense or reference, and so it becomes ideologically anchored to the phonological (graphic, gestural, etc.) substance of the signifier (what Hjelmslev [1961] called the “expression substance”).

Synchronically, the negative repertoire of name-based avoidance registers (i.e., the set of words tabooed in the speech of affines) consists of a set of phonologically associated lexemes that share gestalt phonotactic resemblances to one another. Though the pattern suggests to native folk consciousness that the shared rigid performativity of the elements of this repertoire adheres in the sounds they share, in reality repertoire entries are conventionalized as avoidance targets in more or less explicit fashion (see Elmendorff 1951; Keesing and Fifi’i 1969; and Treis 2005 for ethnographic descriptions of the explicit metapragmatic proscription of homophones). As Maarten Mous (2001) astutely observes, avoided names and avoided (near-)homonyms are never related to one another via discrete and categorical criteria of phonological likeness. To draw again on the Nuaulu data, it isn’t possible to give a description of the would-be necessary and sufficient phonological conditions for tabooing a word: “Usually when three consecutive letters of two words are the same, these words are considered homonyms. However, this is not always the case. Pina ‘female’ can be replaced by tahina, while seite is considered a homonym of Seleputi, even though there are only two consecutive letters in common. Furthermore, one person who cannot say hunane ‘moon’ does say hunahane ‘gold’, even though the first four letters are the same” (Florey and Bolton 1997). The relationship between the tabooed name and the tabooed (near-)homophone is not a simple iconic relation where an identical phonetic substance is shared. It is a configurational or diagrammatic relationship between phonetic elements in the source and target forms that serves as the affordance for certain words being ritually proscribed (see Mitchell 2018).
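
The failure of the "three consecutive letters" rule of thumb can be checked mechanically. The following is a minimal sketch, assuming that orthographic letters can stand in for phonological segments (an assumption, not a claim about Nuaulu phonology); the function name `shares_run` and the list `cases` are hypothetical, and the only data used are the forms cited in this article.

```python
def shares_run(a: str, b: str, n: int = 3) -> bool:
    """True if a and b share at least n consecutive letters
    (case-insensitive). Orthography is used as a rough proxy."""
    a, b = a.lower(), b.lower()
    runs = {a[i:i + n] for i in range(len(a) - n + 1)}
    return any(b[i:i + n] in runs for i in range(len(b) - n + 1))


# (word, form it is compared with, treated alike by speakers?)
# Both cases are taken from the Florey and Bolton quotation above.
cases = [
    ("seite", "Seleputi", True),     # counted as a homonym of the name
    ("hunahane", "hunane", False),   # said by a speaker who avoids hunane
]

for word, other, treated_alike in cases:
    predicted = shares_run(word, other)
    status = "matches" if predicted == treated_alike else "contradicts"
    print(f"{word} / {other}: rule predicts {predicted}, "
          f"which {status} reported practice")
```

Both of the cited cases contradict the heuristic, in opposite directions: it undergenerates for seite and overgenerates for hunahane. The same function does pick out tuka and nesie as sharing a run with Tukanesi, so the rule of thumb is a fair approximation of the gestalt resemblance without being a necessary or sufficient condition on tabooing.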

Nevertheless, globally, the patterning of the negative repertoire of name-based avoidance registers suggests a principle of likeness, a performative participation adhering in the sensual qualia of phonetic sound itself as it striates distinct and semantically unrelated words and expressions. The aggregate effect is to achieve the self-evidence of a semiotic naturalization (rhematization) that interprets what is in reality a lexically anchored rigid performativity (dicent indexical legisigns) as a sonic participation in the taboo name. Here the naturalization runaway of name taboos reaches a limit—rigid performativity is not functionally identified with pure phonetic substance itself. Not all sonically similar forms are tabooed—places of articulation are not proscribed, phonemes are not purged. Practice does not conform to the essentialist folk picture. This final atomic denaturing of sounds (signs, graphemes) into pure performativity—the vanishing point at the horizon of language structure—never comes to pass. So what is this limit on rigid performativity without which there could be no flourishing of linguistic symbolism?

Name and Homophone Tabooing as Naturalization Runaway

Before answering this question, I would like to reframe this problem in a semiotic metalanguage. As I have shown elsewhere, name and homophone tabooing is a widespread pattern cross-linguistically (Fleming 2011, 2014a; see also Simons [1982] for a survey of Austronesian cases). This suggests a motivated, ideologically mediated diachronic pathway from name avoidance to lexical taboo (Fleming 2014c). It is this historical pathway, taken as a whole, that I am characterizing as a process of “naturalization runaway.” In order to understand why names—and not other lexical categories—are so often subject to this sociopragmatic reanalysis, we must note that the particular semiotic dissemination characteristic of names involves a tightly circumscribed semiotic-functional polysemy not unlike that of metapragmatic verbs. By the “semiotic-functional polysemy” of EPs, I refer to their capacity to function either as (metapragmatic) symbols or as (performative) indices. Names, like Benveniste’s pronouns, are empty signifiers, given referential anchoring only when an individual is made to occupy the place of indexical object. In the case of pronouns this is accomplished by speech participant role inhabitance. In the case of names it is accomplished by ritual events of baptism and speech chaining. Names can either be underspecified—as in conversations about baby names (e.g., “I like the name Sarah”)—or referentially bound (e.g., “Sarah lives in New Orleans”).20 Schlücker and Ackermann (2017) propose the terms proper noun and proper name, respectively, for these distinct linguistic entities. The social pragmatics of names slips and slides over this semiotic architecture. Two distinct, but cross-linguistically common, phenomena illustrate these slippages.

Referential Extensions of Names

Ritual namesake relations illustrate a referential extension of names beyond their original rigid designations. Name sharing may simply manifest as a pattern of name transmission that fits with ideologies of kinship (e.g., the “identity” of alternating generations) or it may be dynamically and strategically manipulated as another resource for “alliance” (e.g., Ju|’hoansi [Lee 1986] and Inuit [Guemple 1965] name sharing). These ritual relations often involve the deictic recentering of reference to namesake referent; an individual will call his or her namesake’s kin by the kin terms that this other would employ (recentering of kin term origo) or will be referred to by their kin with kin terms that would be employed for that other (recentering of kin term referent). Deictic recentering tropically identifies the two bearers of the name. A namesake is addressed (or addresses others) as if he or she were his or her namesake. (The trope is most legible in vocative or other addressee-referring contexts, where the mismatch between presupposable and entailed kin relationality is manifest in the contrast between the individuals inhabiting Speaker or Addressee role and the kin term employed.)

Performative Extensions of Names

Names are often potent indices of the social relationship between Speaker and Referent (Brown and Ford 1961; Fleming and Slotta 2018). In in-law avoidance, the use of the name in reference not just to taboo relations but also to their namesakes may similarly be understood as disrespectful (see Tuite and Schulze [1998], 378, for an example of namesake avoidance from the Caucasus). Here the name may be proscribed regardless of the reference of tokens. And indeed, as we have seen, these performative pragmatics may even affect homophones of the name. This is the phenomenon of interest to us here, since it produces a rigid performativity that propagates or disseminates to other elements of the code beyond the name.

Historically, homophone tabooing seems to emerge through a process of naturalization runaway, which can be sketched out in terms of the following stages, each one broken up into events of (a) metasemiotic modeling (or ideological apprehension) and (b) object-sign patterning (or discourse instantiation):

Stage 1

a.  A language-structured, but referentially bound, name (symbolic-indexical legisign) is interpreted as a nonreferential sign of disrespect (indexical legisign) via dicentization “downshifting” (Ball 2014).

b.  The name is avoided and a kin term employed in its place. For example, referring to a senior consanguineal kin by bare first-name is often interpreted as disrespectful (Fleming and Slotta, forthcoming).

Stage 2

a.  A referentially unbound name (a lexical symbol) is rank-shifted into an indexical legisign of disrespect via rhematization “downshifting.”

b.  A substitute name (e.g., “no-name” in Central Australia [Nash and Simpson 1981]) is employed in referring to namesakes of the indexical target. For example, referring to namesakes of a deceased person by name is considered disrespectful to the kin of the deceased among the Karok, the Tolowa and many other native Californian language groups.

Stage 3

a.  Homophones of the name (lexical symbols) are rank-shifted into indexical legisigns of disrespect via rhematization “downshifting.” (Phonotactic resemblance is interpreted as performative identity.)

b.  Repertoires of similar sounding words are tabooed. For example, the homophone taboos discussed for Nuaulu, above (see fig. 7).

The question that I posed at the end of the last section is why this naturalization runaway process stops at stage 3. At each stage, an ideological model feeds back to effect an actual change in the functional instantiation of object-signs. In stage 3, however, the model ceases to correspond to its functional instantiation. Ideology ceases to be a transparent representation of n + 1 indexical function₂ (terminology after Silverstein 2003). Homophones are interpreted as having their force by virtue of their iconic participation in the phonetic substance of the name of, for example, a taboo affine, an ideological model that corresponds to what Irvine and Gal (2000) have called ideological rhematization (originally “iconization”). The question is why this ideological model doesn’t actually lead to a corresponding change in functional instantiation. Given this rhematic interpretation of name and homophone taboos—the essentialization of performative function in the phonetic substance of the name—why don’t the particular phonemes making up the names of affines actually and really come to be tabooed?21 It is true that this question emerges from an esoteric set of empirical materials. But I think that it raises a much more general and essential point about performativity and the conditions of possibility for the flourishing of linguistic symbolism.

Double Articulation as a Block on Naturalization Runaway

As we have just observed, for rigid performatives, the equating of local signifying function with sound (whether in sound symbolism or in expressive morphology) is always limited in its scope and complex in its functional articulation—a mediate more than immediate connection. The reason for this has to do with the strange character of the atoms, subatomic particles, and chemical bonds that compose the Saussurean signifiant. As Hjelmslev observed, these elements of the “plane of expression” do not directly correspond to elements of the “plane of content.” Rather, phonemes and distinctive features have the disjunct properties of being denotationally diacritic (as illustrated by minimal pairs) but void of semantic sense in and of themselves. This characteristic of language structure—Hjelmslev called such elements figurae—is typically understood in functionalist terms, where this particular organization of the phonology-semantics interface offers flexibility in the manipulation and elaboration of the lexicon; “duality of patterning,” as Charles Hockett called it, allows for the endless development and modification of morphemes. Hockett (1960), Hjelmslev (1961), and Martinet (1960) all figure figurae in a functional logic of the economy of signifying “expression,” as having a value elsewhere in the linguistic system. And indeed, it may be that duality of patterning is in some sense selected for to provision a large lexicon and in this sense offer a cultural “fitness” to the group of individuals using such a language (Fleming 2017). But regardless of those effects, figurae have another function—a semiotic function not toward symbolism (in the sense of provisioning symbols) but in resisting (ideologically transparent) semiotic naturalization; they have an antirhematization function.

The duality of patterning is a stopgap measure that prevents the naturalizing reanalysis of signifying form as signifying qualia. As we have seen with name and homonym avoidance, the identification of the plane of expression with performative (iconic-indexical) function takes place at the level of the phonotactic gestalts making up words and expressions and not directly in terms of the basic components of that plane of expression. Hjelmslev characterizes semiotic systems where minimal units of the plane of expression are mapped onto units of the plane of content as monoplanar systems. In such monoplanar systems, the naturalization of iconic-indexical functions could potentially affect the basic elements of the signifying inventory of the language in question. (Think here of red as a signifying expression in road signs. Red always has an iconic-indexical expressive “connotation” of danger whose perlocutionary effect is to put the indexically centered motorist in a state of alert. And this immediacy of signifying is likely adaptive for real-time signaling on autoroutes where the specificity of road signs is not as important as their perlocutionary efficacy. Evidently, phonemes in English [e.g., /s/, /t/, /n/] have no such connotations.)

In dually patterned languages, phonemes do not preserve their properties of signification as they are recycled iteratively to fashion diverse entries in the lexicon. Phonemes do not preserve their semantic significations when they are decontextualized from the morphemic units that encode them. Putting this in the same terms we employed to discuss rigid performatives, phonemic “decontextualization” (which is to say the recycling of phonemes across elements of the linguistic code) defeases the signifying properties of the units of the “expression plane.” Here, duality of patterning serves as a semiotic assemblage that produces this defeasance, a defeasance central to the essence of linguistic symbolism in its delocutive entelechy. The defeasance of signifying functions effected by the double articulation of language stops the runaway naturalization processes whereby symbolic functions become reanalyzed as rigid performatives from percolating all the way down into the minimal units of the “expression-plane” of the linguistic code. The Saussurean “arbitrariness” of the linguistic sign is not only a negative statement—that the signifiant is “unmotivated” with respect to its signifié. It is an effect iconically figurated by this configuration that hypertrophically emblematizes arbitrariness by frustrating any one-to-one association between figurae and symbolic signifying functions, rendering linguistic form just effete enough for symbolism to flourish.

We can situate this intuition with respect to what we might call “Parmentier’s paradox”: Following Peirce’s hierarchical classification of sign types it appears that only semiotic “naturalization” and not semiotic “conventionalization” should be a realizable sociosemiotic process. And yet clearly conventionalization, from artifice to aesthetics, is often realized as a diachronic pathway of semiotic-ideology/-practice dialectics (Parmentier [1994], cited and discussed in Ball [2014]). For Parmentier, semiotic naturalization is defined as a situation where the interpretant models the relationship between the sign and the object to be of a more elementary type than it really is (e.g., an index is represented as an icon). The best studied cases are those of rhematization (Irvine and Gal [2000]; but see Ball [2014] on dicentization). Often rhematization involves a kind of semiotic “essentialization” since sign (e.g., a dialect) and object (e.g., identity group) come to be understood to share properties, an essence, in common. In terms of semiotic practice, naturalization diagrams the relationship between index and object to be maximally motivated. In Judith Irvine’s magisterial descriptions (e.g., 1990, 1993, 1996) of caste-based speech registers, we see that speech differences between Nobles and Griots are not only interpreted as signs of maximally differentiated groups, they manifest as maximally distinguished object signs (e.g., the tempo [slow/fast], the volume [quiet/loud], the richness of noun class marking [rich/impoverished], etc.). Diachronically, in good Hegelian fashion, ideology is rendered partially transparent as enregistered object signs of caste become diagrammatic with the metapragmatic model that interprets them. Semiotic naturalization is here making the register repertoire an indexical icon of the “natural” and “essential” relationship between the indexical sign (speech register employed by Speaker) and its object (caste identity of Speaker).22

Semiotic conventionalization would theoretically involve an opposing movement—an index would be represented by its interpretant as a symbol. The problem is that such a semiotic configuration isn’t permitted by virtue of the way the Peircean triads combine. An Argument must incorporate a Symbolic Legisign—an *Argumentative Indexical Legisign is an impossible Peircean sign type. So how is “conventionalization” semiotically instantiated? Through mechanisms that impede semiotic naturalization—through anti-naturalization. The phoneme has just such an anti-naturalizing function. To say that such and such a consonant is phonemic or that such and such a phonological feature is distinctive in a given language is to say that it has a “psychological reality” as a type, or legisign, for speakers of that language. As a “denotationally diacritic” type (Agha 2007, 108), the phoneme or distinctive feature distinguishes the morphemes of which it is a component part from other morphemes in the system. Figurae (qua figurae) function as indexical legisigns. But though these minimal units of “expression” are diacritic or indexical of symbolic types, as we have seen, there are quite severe impediments frustrating the rhematization of this indexical relationship. This is because each unit of form is diacritic of indefinitely many semantically unrelated morphemes. This organization of the formal inventory of a language serves to defer the performative naturalization or essentialization of those elements precisely because it brackets and relays the relationship between signifier and signified. Language structure not only produces symbolism, it is architecturally designed to shelter symbolism from the iconic-indexical semiosis in which it is always already entangled and interlaced.
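
The combinatorial constraint invoked here can be spelled out. The sketch below enumerates Peirce's three trichotomies under the standard ordering rule, on which the interpretant trichotomy cannot outrank the object trichotomy, which in turn cannot outrank the trichotomy of the sign itself. The encoding of that rule as integer ranks and the names `SIGN`, `OBJECT`, `INTERPRETANT`, and `CLASSES` are assumptions made for illustration.

```python
from itertools import product

# The three trichotomies, each listed from lowest to highest rank.
SIGN = ["Qualisign", "Sinsign", "Legisign"]
OBJECT = ["Icon", "Index", "Symbol"]
INTERPRETANT = ["Rheme", "Dicent", "Argument"]

# A combination is admissible only if its ranks do not increase when
# moving from sign to object relation to interpretant.
CLASSES = [
    (INTERPRETANT[i], OBJECT[o], SIGN[s])
    for s, o, i in product(range(3), repeat=3)
    if i <= o <= s
]

for sign_class in CLASSES:
    print(" ".join(sign_class))

# Of the 27 logically possible combinations, 10 survive.
print(len(CLASSES))
# The "conventionalizing" configuration is excluded ...
print(("Argument", "Index", "Legisign") in CLASSES)
# ... while the "naturalizing" ones are available.
print(("Dicent", "Index", "Legisign") in CLASSES)
print(("Rheme", "Index", "Legisign") in CLASSES)
```

The asymmetry is visible in the output: an Indexical Legisign can be taken up as a Dicent or as a Rheme, but never as an Argument, which is why conventionalization has to be approached indirectly, through whatever impedes the downward path.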

4.  Conclusion

Propositionally rich symbolic communication depends, it is true, upon its contextualization. But just as it depends upon the context of interaction so too does it depend upon the decontext of denotation. That is, it depends upon a decontextualizability of symbolic function that is always vulnerable to indexical entanglements. In language as spoken, written, or signed these indexical entanglements are omnipresent. The point we have tried to make is that where those entanglements become undecontextualizable these may submerge the semantic functions of linguistic signals in their pragmaticity, make denotation overburdened by their interactional effects.

Put in these terms, this essay might be read as a reflection on how social indexical functions differentially contour the cultural practices that constitute the “entextualization/co(n)textualization process” (Silverstein and Urban 1996, 3). Cultural participants “engage in processes of entextualization to create a seemingly shareable, transmittable culture. They can, for example, take some fragment of discourse and quote it anew, making it seem to carry a meaning independent of its situation within two now distinct co(n)texts. Or they can transcribe a fragment of oral discourse, converting it into a seemingly durable and decontextualizable form that suggests to interpreters a decontextualizable meaning as well” (2). As the enduring interest of linguistic anthropologists in voicing, reported speech, and intertextuality makes manifest, the re-entextualization of efferent signals depends upon the possibility of their decontextualization. Signals that are undecontextualizable resist, in the various ways and dimensions outlined above, these processes of re-entextualization. Although entextualization/co(n)textualization is inherently a discursive phenomenon, through the mediation of processes of enregisterment there may be a telescoping of parole into langue, of conduct into code. That is, elements of the linguistic code are stereotypically understood (i.e., enregistered) as indexing certain contextual arrangements. For the case of normatively negatively valued rigid performatives, this percolation up from token to type is revealed as a persistent threat to (symbolic) language (conceptualized as an always already enregistered repertoire). Our recognition—whether as “informants” or the “informed”—of the abstractable symbolic properties of sense-bearing units of the linguistic code depends as much upon the decontextualizability of those units as it does upon their discursive co(n)textualization. (Symbolic meanings are necessarily decontextualizable meanings.)
This (folk) analysis—central to the referentialist biasing of secondary rationalizations about language—is only possible where the “communication” of iconic-indexical resonances of register repertoire elements across events of signaling is socially sanctioned. Here, note, we do not even speak of a sanctioned citation (in the everyday sense of quotation) but of signal replication as such—an i(n)tera(c)tion of elements of the code across events. All of this is to say that symbols (cultural-texts-in-miniature) similarly have as their condition of possibility the entextualization-co(n)textualization process.

The objection might be raised that verbal taboos (functionally defined in terms of pragmatic indefeasibility and proscription) are a specialized vocabulary in Standard Average European languages. Perhaps the purported danger of rigid performativity to linguistic symbolism can be discerned in the “exotic” speech communities of Aboriginal Australia where the entire lexicon of languages like Warlpiri (Kendon 1988), Djirbal (Dixon 1990), or Guugu Yimidhirr (Haviland 1979) appear to be enregistered as social indexicals of this kind. But certainly they will be of only marginal interest for the linguistic anthropology of “‘modern, rational’ people … [for whom] words ‘mere words,’ [are] in no way consubstantial with the thing itself” (Rumsey 1990, 354). On closer inspection this does not appear to be the case.

The differences between register variation in modern industrialized societies and register variation in Aboriginal Australia are more a reflection of differences in the social organization of registers than differences in the scope of the rigid performativity problem. What makes the study of Australian speech register variation unique is the way in which variation is so tightly sutured to intimate social relationships between Animators, Recipients, and Referents—an initiate to his circumciser, a woman to the parents of her dead husband, a man to his mother-in-law. Rumsey (1990) argues that reported speech constructions and nonconfigurational anaphora may serve as the grammatical bases for a Whorfian ideological projection that privileges a “wording” over “meaning” model of language in Aboriginal Australian speech communities. But this ideological centrality of wording is also sustained by the rich relational register variation that makes clan-lects, mother-in-law languages, initiate registers, and signed mourning registers emblematic of the relationships between speech animators and recipients, and between these and the literal contexts of speaking (i.e., Country). That is, wording is foregrounded because of the salience of register shifting as an emblem of social relatedness.

But are there not register phenomena (e.g., ethnic, racial, gendered, sexualized, or class-based registers) that striate speech production in an analogous fashion in modern industrialized speech communities? Certainly, in such societies, speech variation is institutionally anchored as emblematic of animator identity, making the indexical potency of register shifts for social relationship subject to systematic “erasure” (Irvine and Gal 2000). Worse still, those registers that best conform to this model of a one-to-one mapping of speakers onto registers—the hegemonic monolinguistic model of the national identity-bearing speaking-subject—are those at the top-and-center of socially stratified systems of sociolinguistic evaluation (e.g., registers indexical of the economically privileged, of whiteness, of masculinity); this at once naturalizes the ideology of the inalienability of register and simultaneously frames code switching and register mixing as deviant, substandard.

Nevertheless, in practice, register variation is the norm, not an outlier. And often, these switches do appear to be governed by a social pragmatics of rigid performativity that sociologists and sociolinguists have alternatively analyzed under the frame of “stigma” and “hypercorrection” (Goffman 1963; Labov 1964). Those of us who have worked on the dynamics of language shift—whether in local indigenous language communities or among immigrant language communities—are intimately aware of the heightened performativity of “heritage” codes, whether as positively or negatively evaluated. As Moore writes of his experiences working with Northern Paiute, Sahaptin, and Upper Chinookan speakers: “My own ethnographic experience (and in this respect I think my experience is typical) strongly suggests that when a language is spoken only by a few people in a local community, and by them only on rare occasions, its ‘functional,’ symbolic, and interactional potency as a communicative medium is in fact greatly increased rather than attenuated” (Moore 2006). Though language shift is not always driven by a performative heightening of the indigenous code-as-enregistered, this is a common dynamic and one that does seem to be productive of the hypertrophied indexical “potency” described by Moore (cf. Hill’s [2002] language endangerment rhetoric of “hyperbolic valorization”). Rigid performativity here reduces the contexts of production of indigenous codes (cf. Fishman’s “functional reduction”), but augments their force (Fleming 2010).

But, and this is an important caveat, even in these most dramatic of cases, the threat of rigid performativity is never an existential one for symbolism. Symbolism thus purged seems to continually redouble itself whether in lexical doubling (as with mother-in-law vocabularies [Fleming 2015]) or in a duplication of the code itself only in another modality (as with alternate sign languages that emerge with the complete banishment of speech itself [Fleming 2014b]). (Language shift, similarly, always involves a move away from a source language but toward a target, just as “hypercorrection” is an analog movement away from a stigmatized variety but toward—even beyond—a “prestige” variety.) This symbolic mitosis is inescapable, being the very counter-sign of the emergence of the taboo function—to use the Polynesian vocabulary, the presence of a noa word marks the absence of its tapu counterpart.

Register doublings—even where they radically transform the code of a speech community—preserve symbolism. I have argued that this perdurance and preservation of the symbolic in the face of the iconic-indexical vicissitudes of language-in-cultures of semiosis rests upon a final safeguard whereby—to invoke a different Austen—the sense and sensibility of the phonemic is camouflaged. That safeguard is the duality of patterning, which brackets the relationship between sound and sense. Duality of patterning as a semiotic infrastructure undergirds this decontextualizability of symbolic legisigns, rendering more difficult (because more mediate) the rhematizing identification of signification with the atomic signifying elements of the code (distinctive features and phonemes). Indeed, this semiotic infrastructure of the phonemic is itself an emblem for linguistic symbolism. It says: Symbols are supposed to describe things without being them. Ironically, then, even the anti-naturalizing subbasement of the symbolic is an iconic indexical—a sign of symbolism itself.


Contact Luke Fleming at Département d’anthropologie, Université de Montréal, 3150 Jean-Brillant, Montréal, QC H3T 1N8.

Earlier iterations of parts of this article were presented and thoughtfully commented upon at the University of Chicago “Semiotics Workshop” (2011) and at the UT Austin “Symposium about Language and Society” (2014). My thanks to Asif Agha, to an anonymous reviewer, to Michael Silverstein for suggesting the terms undecontextualizable and rigid performativity, and to James Slotta. Finally, my thanks to the students in my 2018 graduate seminar “Performativité et le pouvoir des mots” for whom I drafted this text.

1. Much ink has been spilled in describing curse-words in pragmatics (e.g., Potts 2005), semantics (e.g., Allan and Burridge 1991), neurolinguistics (e.g., Jay 2000), and sociolinguistics (e.g., Vincent 1982). I am embarrassed by how little of that literature I have read. I wish to apologize to those who have had the same insights as myself but who are not cited, and prostrate myself before those who have better grasped the phenomena at hand and remain unread.

2. Here, as elsewhere in the article, we employ a simplified version of Jakobson’s notation for deixis. The here-and-now event of signaling is symbolized by Es and the narrated event by En. Our notation additionally indicates whether the events of narration and signaling are distinct (≠) or identical (=). Explicit performativity involves that special case where En = Es. (Silverstein 1993 calls this a “reflexive calibration” of indexicals onto the event-of-signaling.)

3. Perhaps one could bring in here Austin’s “theatrical” exception, which so animates Derrida’s (1971) essay Signature, événement, contexte. And it is certainly the case that on the stage the handshake “merely” stands for a handshake. Two differences still remain with EPs: (1) the theatrical handshake still also counts as one, in some sense different than saying, “He shook her hand” counts as a handshake, and in this sense is like the rigid performatives described below; (2) the theatrical citation of handshaking is achieved by a global framing (the playbill, the theater seating, and the track lighting) more akin to a matrix verb of speaking, if we push the analogy to linguistic performativity, than to metapragmatic verbs that themselves count as the matrix verb of speaking and the predication of the act (cf. “She wrote ‘I bequeath to Rudyard my gold watch …’” with “She bequeathed to Rudyard her gold watch”).

4. This asymmetry between a relatively impoverished metapragmatic lexicon and an indefinitely rich pragmatics is an important characteristic of metapragmatics as a framework for interpreting (non)languaging. It is the cause of no end of confusion for speech act theorists after Austin; cf. Searle’s (1975) discussion of “indirect speech acts.” See discussion of this theme in Agha 2007, 55–64.

5. Here I use Silverstein’s distinction between function1 and function2. “Let us call [the] goal-directed and sometimes goal-achieving categorization of occasions of use the function1 of language. … Insofar as function1 is externalized in verbalizations about language … it implies a metalinguistic function1 for language itself. … Let us call [the] indexical quality of [tokens of] speech forms, or indexical mode of their signification, function2” (1979, 206).

6. Again, nonlinguistic examples help to put into relief this relationship that can be difficult to perceive because of the formal identity of the signal that functions both as a symbol and as an indexical. The interested reader should consult Tambiah’s (1984, 74) analysis of Evans-Pritchard’s Azande materials. Take, for instance, his example of the tardy traveler who places a circular rock in the branches of a tree to retard the passage of the sun across the sky. Here we have little trouble seeing that the rock in the tree is an icon of the act, retarding the sun in the sky, which the magical rite seeks to accomplish. In a manner that is wholly parallel, Silverstein is arguing that the explicit performative is an icon of the metapragmatic symbol.

7. Note that this is not the case with anaphors, where one must know the co-textual antecedent, or with names, where one must be socialized to the rigid designation of a name in order to identify the referent of the token. For these noun-phrase types, knowing the pragmatic rule of use isn’t sufficient for successfully determining the reference of discourse tokens.

8. But note that if taboo occurrences force a recontextualization of the social happening in which they manifest, this is also to say that they resist decontextualization, since signal repetition always entails perlocutionary replication. See the conclusion for further discussion of this theme.

9. See Stasch (2009), drawing on Valeri (2000), for a discussion of verbal taboo in terms of a logic of sacrifice.

10. Though I have already assumed that fuck can operate in a purely expressive manner (and this is widely assumed in the broader literature on curse words), we should quickly present the evidence: The syntactic and derivational flexibility of fuck are tightly correlated with usages that do not contribute to the truth-conditional propositional content of utterances. This includes fucking as adjective or adverb (“It was a [fucking] mess.” or “He was [fucking] stabbed!,” etc.) or as an infix “Un[fucking]believable.” All of this is extensively treated in the generative linguistics literature.

11. In my idiolect there is only one phonologically based substitute that can replace fuck where the latter makes a semantic contribution to propositional content. That is eff, as in: “He got eff-ed up on whiskey.” This eff in question is the pronunciation of orthographic F, itself employed to phonologically disguise the delocutive noun used to refer to fuck as a lexical type (it is “the F-word” not “*an F-word”); that is, the expression “F-word” is used to predicate the use of the term without accomplishing its pragmatic effect. This is the exception that proves the rule: Rank shifting to a metapragmatic level gives pragmatic terms not only indexical functionality but also symbolic richness. Because eff is the metapragmatic designator of fuck it can participate both in its purely pragmatic functions (“What an eff-ing mess.”) and its semantic ones (“He got eff-ed up.”). Eff functions here in a manner similar to other conventional forms of metapragmatic noise—noise over the signal composing the phonic, graphic, or visual level of linguistic patterning. Think here of the use of asterisks to replace orthographic characters in the graphic-visual modality (e.g., «f * * *»), of the blurring out of the middle finger or even of the distinctive mouth gestures “readable” as cursing, or of the bleeping out of curse-words on television or radio. Indeed, bleep is another delocutively derived example of metapragmatic noise (cf. other delocutives of sounding like buzz, beep, knock), also capable (like eff) of substituting in pragmatic function1 (cf. “Knock, knock” said to announce one’s presence at an open door). In the penultimate sentence I employed it as a metapragmatic term (“The radio station has a five-second delay so that they can bleep out any curse-words”), but it can also be employed in place of the pragmatic sign (“He is a bleeping egomaniac”). 
Note that metapragmatic and pragmatic function are almost, but not quite, aligned: The metapragmatic term that describes the use of a signal employed to suppress an RP functions1 in paradigmatic alternation with that RP. For comparative examples of the rank shifting of metapragmatic noise, see ballishsha registers in Eastern Cushitic languages (Treis 2005) and “no name” post-mortem namesake reference in Central Australia (Nash and Simpson 1981).

12. Joking relations may be expected to speak obscenely with one another, but such speech continues to have performative effects. The force of obscenity is not defeased; rather, it is rerouted to seal and bind the social relation between joking partners. And should that rerouting fail the “jokes” fall back into “insults.”

13. It should be underlined that not just negatively evaluated but also normatively positively valued signs may conform to our functional characterization of rigid performativity. An example of this sort might be the Kabbalistic ideology of language, which sees the Hebrew language as having a performative force due to each element of its sound inventory participating in the names of G-d (Scholem [1926] 1990). We focus on negatively evaluated forms because the naturally occurring practices of linguistic avoidance that crop up in their surround serve as “reactances” that reveal the semiotic functional organization of such sign types.

14. Goffman vacillates somewhat on the category of recipient; it may be that some ambiguity is unavoidable here. Earlier in the same essay he has separated ratified and unratified recipients from bystanders, whether as overhearer (a bystander that animator is aware of) or eavesdropper (a bystander that animator is not aware of). For the time being let’s stick with Recipient as a characterization that is independent of the intentions, knowledge, and so on, of the individual occupying “the role of utterance production.”

15. Note that in animal communication, there is no such distinction or fractioning of the utterance producer. Take, for instance, the famous waggle-dance of the honeybee; the signal, which informs the recipient of the direction and distance of a nectar or pollen cache, may be transmitted to—really replicated by—another signaler. This second signaler, however, has no way of communicating that the information is (like the evidential category) hearsay.

16. The possibility of Whorfian universals exposes problems in the terms of the debate over so-called “linguistic relativity.” These problems were seeded by Whorf’s use of the Einsteinian idea of the relativity of space-time for distinct observers as a rhetorical trope in discussing differences in the linguistic encoding and cognitive interpretation of time between speakers of Standard Average European languages and Hopi. Following this line of inquiry, neo-Whorfian approaches have privileged the study of how cross-linguistically variable structures (nominal classification and spatial reference, being the two biggest success stories) affect habitual thought. They have done this for two reasons. First, methodologically it is easier to show cognitive effects where those differ between users of distinct languages (i.e., co-vary with linguistic differences). Second, linguistic anthropologists tend to think of Whorfian effects as a form of relativity because such effects are counter-posed to hardwired biologically based universals of the Chomskyan variety. (At the same time, linguistic relativity can be made to rhetorically conform to the discourses of cultural relativism so popular in undergraduate-level cultural anthropology.) Indeed, formal linguists have been some of the fiercest critics of Whorfianism in any and all of its incarnations (Pullum 1991; Pinker 1994; McWhorter 2014), while prominent neo-Whorfians have been some of the most vocal critics of the Chomskyan approach (Evans and Levinson 2009; Everett 2016). Strategically, this approach effectively cedes universals to the bio-reductionists. This is strange, since so-called “functionalist” and typological approaches to language have long considered the possibility that there are aspects of language structure that are universal even though they are not specifically and differentially subtended by biologically based priming. 
Grammatical person as the hinge between interactional roles and denotational textuality is just one such functionally adaptive structure seen as tending toward universality. Duality of patterning might be another good candidate; it is seen as necessary for the production of a large vocabulary but may not have required specific biological adaptations other than those necessary for phonological and morphological production (Blevins 2012). Emerging village-sign languages appear to lack it (Sandler et al. 2011), so it looks like it is an emergent property of language that may appear only after generations of use. I offer an overlaid semiotic-functional interpretation of duality of patterning in Sec. 3.

17. The astute reader will observe an asymmetry between 1st and 2nd person plurals. The set denoted by a 1st plural may include Addressee, while the set denoted by a 2nd plural cannot include Speaker (e.g., Toi et moi sommes/*êtes allés). For rich discussion of this asymmetry, see Corbett (2000) on what he calls the “agreement hierarchy.” Note that many languages compensate for this asymmetry via a distinction between inclusive and exclusive 1st person nonsingulars.

18. See Fleming (2012, 311–15) for further discussion of this theme and for citations. See Rose (2013) for a description of how the use of gender indexicals in reported speech constructions in Mojeño aids reference tracking.

19. I hope I don’t betray a doubt in the analysis if I hedge by saying that etic and emic are used here in a quasi-metaphorical sense. There is a trope of the emic, as that which manifests within the (categories) of the linguistic system, and of the etic, as that which is without, which is fruitful here. But inasmuch as this etic space is signified (as with “animator”-indexing) or signifying (as with the phonetic taboos) there is a corollary enveloping within—if not the langue then at least—the system of speech registers, an enveloping that never allows for a “true” (etic) exteriority.

20. “Names are [often] recycled within a language community so that many people have the same name. Why then does a name appear to be associated with a unique individual? The puzzle cannot be solved without recognizing that the cognitive regularity is underlyingly a social regularity: when we say that a name is associated with a unique individual we are saying that some people associate the name with one individual, though others may associate it with someone else. The uniqueness of pairing of name and referent is a regularity that holds for a social domain of persons” (Agha 2007, 66).

21. The name and homophone avoidance registers that perhaps most closely approach this possibility are the Southern Nguni hlonipha and Eastern Cushitic ballishsha in-law registers (see Fleming 2014a, 130–35), but even here homophone-likeness is evaluated in terms of syllable-level phonotactic resemblances, never in terms of segment identity.

22. It is true that the folk model is, in some real sense, a necessarily incomplete analysis of semiotic function, something that is revealed in those moments so susceptible to “erasure” where Nobles of relatively low status in the local Es interaction act as muted Griots, recounting stories for the entertainment of higher ranking Nobles (Irvine 1990, 150). Nevertheless, object-sign differentiation emblematizes the ideological conceit of caste-essence complementarity.